DOCUMENT RESUME

ED 190 080                                              IR 008 561

TITLE          Individualized Instruction for Data Access (IIDA).
               Quarterly Reports No. 2-3.
INSTITUTION    Drexel Univ., Philadelphia, Pa. Graduate School of
               Library Science.
SPONS AGENCY   National Science Foundation, Washington, D.C.
PUB DATE       78
GRANT          DSI-77-26524
NOTE           69p.; For related documents, see ED 145 826,
               [illegible] and ED 179-195, and also IR 008 562.

EDRS PRICE     MF01/PC03 Plus Postage.
DESCRIPTORS    *Computer Assisted Instruction; Computer Programs;
               Data Bases; *Formative Evaluation; *Individualized
               Instruction; *Information Retrieval; Relevance
               (Information Retrieval); *Search Strategies;
               Summative Evaluation
IDENTIFIERS    *Individualized Instruction for Data Access


ABSTRACT

     These quarterly reports for September and December 1978 are two
in a series providing a formative evaluation of the Individualized
Instruction for Data Access (IIDA) project, an investigation of
searching behavior patterns conducted to develop (1) a model of good
searching procedure which could be used as the basis for teaching new
users of IIDA how to search, (2) specific indicators which the IIDA
program could use to analyze a search in progress and determine trends
of searching behavior, and (3) an analysis of commonly made errors and
the means by which they can be detected and corrected by the IIDA
program. Report No. 2 contains a search process analysis (including a
searching behavior study, diagnostic procedures, and a search
exercise), and a search process assessment (including an error
analysis and an identification of measures which discriminate between
users). Report No. 3 examines three issues: aspects of formative
evaluation planning which overlap with summative evaluation, plans for
evaluation of the impact of the system on users with IIDA as either
assistant or instructor, and kinds of measures which can and should be
used in assessing the impact of IIDA. (FM)


*   Reproductions supplied by EDRS are the best that can be made   *
*                   from the original document.                    *



U.S. DEPARTMENT OF HEALTH,
EDUCATION & WELFARE
NATIONAL INSTITUTE OF
EDUCATION

THIS DOCUMENT HAS BEEN REPRODUCED
EXACTLY AS RECEIVED FROM
THE PERSON OR ORGANIZATION ORIGINATING
IT. POINTS OF VIEW OR OPINIONS
STATED DO NOT NECESSARILY REPRESENT
OFFICIAL NATIONAL INSTITUTE OF
EDUCATION POSITION OR POLICY.



INDIVIDUALIZED INSTRUCTION FOR DATA ACCESS

(IIDA)

Quarterly Report No. 2
September, 1978

Drexel University, School of Library
and Information Science
Franklin Institute Research Laboratories

NSF Grant No. DSI 77-26524



"PERMISSION TO REPRODUCE THIS
MATERIAL HAS BEEN GRANTED BY

Charles T. Meadow

TO THE EDUCATIONAL RESOURCES
INFORMATION CENTER (ERIC)."



TABLE OF CONTENTS

I.   OVERVIEW

II.  SEARCH PROCESS ANALYSIS
     1. Searching Behavior Study
        1.1 Procedure
        1.2 The Search Requests
        1.3 Search Study Results
        1.4 Pre-Search Problem Solving
        1.5 Use of the Model
     2. Diagnostic Procedures and a Search Exercise
        2.1 Introduction
        2.2 Diagnostic Procedures
            2.2.1 Search Structure
            2.2.2 Syntactic Analysis
            2.2.3 Procedural Analysis
            2.2.4 Conversation with the User
        2.3 Exercise 2
            2.3.1 Program Description
            2.3.2 Threshold Analyzer
            2.3.3 Warning Control Program
            2.3.4 Expand Look-up Subroutine

III. SEARCH PROCESS ASSESSMENT
     1. Error Analysis
     2. Identification of Measures which Discriminate Between Users
        2.1 Introduction
        2.2 Objectives and Rationale
        2.3 Methodology
            2.3.1 Subjects
            2.3.2 Variables
            2.3.3 Search Problems
            2.3.4 Procedure
        2.4 Results
        2.5 Discussion

IV.  REFERENCES

Appendix A: On-Line Searching Project Instructions
Appendix B: Search Requests
Appendix C: On-Line Searching Error Classification Definitions


I. OVERVIEW

This project represents a renewal of earlier work on Individualized
Instruction for Data Access (IIDA). Begun in July 1976, with initial
funding for one year, the project was resumed in April 1978 and is to be
completed in two years. This series of quarterly progress reports is
planned to report in depth on selected aspects of the project and to
contain a brief, overall progress statement in each report.

The project staff are divided into two groups. The computer group is
concerned with the design, implementation and testing of the requisite
computer programs. The behavioral group is concerned with formative and
summative evaluation of IIDA. In formative evaluation of IIDA our concern
is with monitoring system development and with providing feedback and
information for refinement and further development of the system. In
summative evaluation of IIDA our concern is with an assessment of the impact
and effectiveness of the system and the extent to which the objectives of
the project are met.

Given that the system cannot be subjected to summative evaluation
until certain basic programming chores have been completed, most of our
activities have centered around refinement of project design. Consequently
the bulk of this report will be focused on formative evaluation, with plans
for summative evaluation being a major topic in the next report.

One major activity of the behavioral group has been the investigation
of searching behavior patterns in order to develop:

1) a model of good searching procedure which could be used as the
basis for teaching new users of IIDA how to search (in the exercise mode),
and as a means of determining the searching behaviors which should lead
to either successful or unsatisfactory search results;

2) specific indicators which the IIDA program could use to analyze
a search in progress (in the assistance mode) and determine trends of
searching behavior which are likely to produce less than satisfactory
results or which are simply non-productive or inefficient; and

3) an analysis of commonly made errors and means by which they can
be detected and corrected by the IIDA program.

In order to accomplish these objectives we have been following two lines
of investigation. The first of these is an in-depth analysis of the searching
behavior of a few professional searchers. The second is an attempt to identify
and develop widely applicable measures of searching performance which
discriminate among searchers with varying degrees of experience.

The second section of this report deals with the first two objectives
mentioned above and will discuss the conduct and applications of our first
line of investigation. The third section of this report will discuss the
second line of investigation and deals mainly with the second and third of
the three objectives mentioned above. (This latter portion of the report
presents the substance of a proposed research project currently being conducted
as part of a Ph.D. dissertation by one of the project members.)



II. SEARCH PROCESS ANALYSIS



1. Searching Behavior Study

The assumption which underlies this line of investigation is that a clearer
understanding of what a good searcher does when doing a good job of searching
should be beneficial in refining IIDA.

1.1 Procedure 

In our first attempt at analysing searching behavior, we utilized a
large number of transcripts of searches obtained from local cooperating libraries.
These searches were the results of actual user requests and varied in many
respects, such as the search systems and data bases used, the search topics,
and the scope of the search. Finding an underlying model for searching using
these transcripts proved to be very difficult, both because of the great
diversity among them, and because the transcripts alone did not always give
sufficient information about the problem solving process that the searcher was
engaged in during searching. Quite often the reason for inputting a certain
command or series of commands was not at all obvious and the search could not
be meaningfully evaluated.

To overcome these problems we devised a more controlled study using one
search system (Lockheed), one data base (ChemCon '72 - '76), and three search
requests which we selected and pilot tested ourselves. We asked nine different
searchers, all of whom performed on-line searching as part of their jobs, to
do one search each so that each of the search requests was searched by three
different searchers. In order to provide more insight into the procedures used
to solve search problems, we asked the searchers not only to conduct the search
on-line, but also to "think out loud" while formulating their strategy for the
search.

This procedure is an adaptation of the protocol method used by Newell
and Simon (1) to study general problem-solving behaviors. Although Newell and
Simon asked their subjects to "think out loud" during the entire problem solving
process, we did not require searchers to talk about their procedure while they
were actually on-line. In pilot testing we found that thinking out loud while
doing on-line searching was very disruptive for some searchers and could lead
to a less efficient search than they normally performed. We found, however,
that the transcript itself could substitute very nicely for thinking out loud
when it was supplemented with the searcher's thinking out loud before going
on-line (during the strategy formulation phase). Furthermore, the transcripts
became even more informative when supplemented by the searcher's comments in
an interview conducted after the search was complete.

In summary, the procedure we used was to give the searchers a search request
and ask them to think out loud (into a tape recorder) while they looked over the
request and began to formulate a strategy for solving it. Then the searchers
were asked to actually conduct the search on-line. Finally, we asked the searchers
to go over the transcript when the search was complete and to verbally describe
the process. In particular they were asked to explain their reasons for any
changes they had made in their initial strategy as a result of the way the
search progressed. The instructions appear in Appendix A.

1.2 The Search Requests

Three different search requests were used for this study in order to
determine whether the intended or expected scope of the retrieval set would
evoke any major differences in searching behavior. The search requests appear
in Appendix B. The three search requests were each structured differently
to try to stimulate maximum amounts of particular kinds of searching behaviors.
Each was intended to represent one of the types of searches described by
Markey and Atherton (2).

The first request was designed to represent a fairly standard search with
a moderate sized retrieval set (our pilot test retrieved 39 hits). The request
involved the logical conjunction (AND) of two sets created by combining related
terms (OR) and thus represented a "building block" type of search. More
advanced features peculiar to the ChemCon data base, such as utilizing chemical
registry numbers or Chem Abstracts section codes as searching aids, could be
used but were not essential for a successful search.

The second request referenced a particular article and asked for more
articles on the same subject. This type of request should elicit a
"pearl-growing" type of search in which the searcher begins with one hit and
progressively expands the retrieval set.

The third request tried to elicit the opposite response: a "successive
fractions" type of search. The request specified a fairly large and general
subject area (one with over 500 hits) and asked for a highly limited retrieval
set ("a few major references"). The appropriate search behavior would be to
progressively reduce the number of hits retrieved. (The members of the project
have begun to refer to this type of search as "onion peeling.")
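
The set logic behind two of these request types can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy index, its term names, its record numbers, and the helper names `select`, `building_block` and `successive_fractions` are all invented here and are not part of DIALOG or IIDA.

```python
# Toy inverted index: term -> set of record numbers (all values invented).
INDEX = {
    "compound_a": {1, 2, 3, 4},
    "compound_a_synonym": {4, 5},
    "method_x": {2, 4, 6},
    "method_x_abbrev": {5, 7},
    "review": {2, 5, 9},
}


def select(term):
    """Return a copy of the set of records posted to a term."""
    return set(INDEX.get(term, set()))


def building_block(concepts):
    """OR the terms within each concept, then AND the concept sets."""
    concept_sets = [set().union(*(select(t) for t in terms))
                    for terms in concepts]
    return set.intersection(*concept_sets)


def successive_fractions(general_term, limits):
    """Start from one broad set and narrow it with each limiting term."""
    result = select(general_term)
    for term in limits:
        result &= select(term)
    return result


blocks = building_block([["compound_a", "compound_a_synonym"],
                         ["method_x", "method_x_abbrev"]])
peeled = successive_fractions("compound_a", ["method_x", "review"])
```

In the building block case the final set is the intersection of two unions; in the successive fractions case every step can only shrink the set, which is the "onion peeling" behavior the third request was meant to elicit.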

1.3 Search Study Results 

The search flow in each of the nine searches was analyzed. The searches
were compared with each other to look for underlying similarities among the
search methods used. Then the three searches for each request were compared
in more detail to look for procedural differences attributable to the type
of search requested.

The overall comparison of all nine searches revealed some striking
similarities in the basic search flow (See Figure 1). With relatively minor
variations each search followed a similar four-step recursive pattern: a)
strategy formulation; b) selection and combination of terms to form an initial
set; c) a decision as to the adequacy of that set (usually made on the basis
of viewing some of the records retrieved by the first pass); and d) either
recycling through steps a through c or printing out the results of the search.
At this level of analysis the basic search flow would appear to be general
not only across searchers and search types but also across search systems.
For example, in an article on the use of the SDC system, Morrow (3) presents
an essentially identical flowchart.

   I.   Formulate strategy.
   II.  Select search term(s); combine or limit as needed.
   III. Display records retrieved.
        Decision: is the set adequate? This decision can be based
        on any, or any combination, of the following decisions,
        in any order:
          a. are the records relevant enough;
          b. is the scope of the retrieved set broad enough;
          c. is the scope of the retrieved set narrow enough;
          d. have all relevant terms been selected?
        If no, return to step I. If yes, go on to step IV.
   IV.  Type out results.

Figure 1. Generalized search flowchart.
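
The four-step recursive pattern described above can be written as a small loop. The sketch below is an illustration only, not the project's code: the function name `run_search`, its callback parameters, and the `max_cycles` cut-off are assumptions introduced here.

```python
def run_search(formulate, select_and_combine, is_adequate, max_cycles=10):
    """Repeat steps a) to c) until the set is judged adequate (step d).

    formulate(history)        -> a strategy, given earlier (strategy, set) pairs
    select_and_combine(strat) -> the retrieved set for that strategy
    is_adequate(result_set)   -> True when the searcher would print the set
    """
    history = []
    result_set = set()
    for cycle in range(1, max_cycles + 1):
        strategy = formulate(history)               # a) strategy formulation
        result_set = select_and_combine(strategy)   # b) select and combine
        history.append((strategy, result_set))
        if is_adequate(result_set):                 # c) adequacy decision
            return result_set, cycle                # d) print the results
    return result_set, max_cycles                   # gave up after max_cycles


# Hypothetical usage: a narrow first strategy fails, a broader one succeeds.
plans = ["narrow", "broad"]
hits = {"narrow": set(), "broad": {1, 2, 3}}
result, cycles = run_search(lambda history: plans[len(history)],
                            lambda strategy: hits[strategy],
                            lambda found: len(found) > 0)
```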

Analysis of the differences in search behavior as a function of search
request types has so far revealed few differences beyond those expected. The
first search request, requiring a "building block" approach, successfully
elicited this type of behavior. It was also, apparently, the most complex
type of the three. Furthermore it prompted the most rigorous pre-search
behavior. All pertinent searching aids offered by Chem Abstracts on-line and
off-line were utilized by at least one of the searchers to determine
appropriate search terms. These included synonyms and registry numbers for
the compounds named, and section codes and standard terms or abbreviations
used for the analytical methods requested. Once the pre-search sources had
been exhausted and a satisfactory list of terms compiled, the actual search
was very straightforward. Terms were selected and related terms were OR'ed
together to form two sets. Next these two sets, representing the two major
concepts of the search, were AND'ed together to form the final set. Two of
the three searchers repeated the process (See Figure 2). In one case the
iteration was triggered by an unexpectedly low number of hits for a term.
Upon displaying a few of the records from the initial set (step III), the
searcher discovered a preferred term which he then selected and combined with
the other terms to achieve a final set that seemed satisfactory. In the second
case the searcher began with only the most basic terms (a section code number
for analytical methods, and registry numbers for the two major compounds
named) to see if this simple strategy would suffice before trying a more
elaborate one. On receiving a very small set he displayed a few records
and noticed that the section code was retrieving some records on methods other
than the ones requested. He then reiterated by expanding the compound set
to include synonyms and by restricting the methods set by entering keywords
for the methods actually requested.

The second search request was not as successful as the first in eliciting
the type of search ("pearl growing") it was designed to demonstrate. One of
the three searches was sufficiently poor (owing to the searcher's lack of
familiarity with the topic) that it has resisted analysis. The other two
searches were almost identical, using keywords from the title of the "seed"
article to produce a set of other relevant documents (See Figure 3). One
of these two used a more restrictive combination of keywords and, upon
retrieving a very small set, sought to enlarge the set by recombining them
in a less restrictive way. The second search was judged by the searcher to
be adequate after the first pass and no iterations were made.

The third search request, like the first, elicited the expected searching
behavior. All three of these searchers began by selecting the general term
specified in the search topic and then proceeded to limit this set in
various ways until, in the searcher's estimation, the set size was small
enough to fit the requirements of the request (See Figure 4).

   (1) Strategy: use registry numbers, general name for compounds,
       key words for methods; select and combine terms; display hits;
       then use CA abbreviation for method name; select & recombine.
   (2) Strategy: use section code for methods, registry numbers &
       general name for compounds; select and combine terms; display
       hits; then use synonyms for compound names, key words for
       methods; select & recombine.
   (3) Strategy: use key words for methods, general name & common
       names for compounds; select and combine terms; display hits.
   Each search ends when the set is judged adequate and the results
   are typed out.

Figure 2. Flow charts for search request 1.

   Strategy: use keywords from title; select multiword term; then
   use same keywords, recombine (AND); select & recombine; set
   adequate; type out results.

   Strategy: use keywords from title; select multiword term;
   display hits; type out results.

Figure 3. Flow charts for search request 2.

   (1) Strategy: use general term, limit to reviews & English
       language; expand term, select term, limit to reviews & Eng.
       lang.; display hits; use first 20 hits of set (most recent);
       type out results.
   (2) Strategy: use general term, limit to reviews; select term;
       display hits; limit to reviews; select reviews, combine;
       type out results.
   (3) Strategy: use general term, limit as needed; select term,
       limit to titles, English lang. & reviews; type out results.

Figure 4. Flow charts for search request 3.

1.4 Pre-Search Problem Solving

Another area of search behavior analysis we are attempting with the
experimental searches is the construction of problem-solving graphs
for the searchers' pre-search strategy formulation phase, similar to those
developed by Newell and Simon (1) for some of the problems which they studied.
This procedure involves developing a graph of the mental steps taken by the
solver in trying to solve a problem. This includes "back-tracking" steps
where the solver backs up to a previous point in his solution after realizing
that the path he is following will turn out to be fruitless. Hopefully this
will give us further insight into additional ways in which IIDA might aid
novice users in attacking a search problem. We have, however, just begun
this analysis and have no results to report at this time.

1.5 Use of the Model

Having developed a general model for the search process (Figure 1), we
then put the model into use. Consistent with our first goal for this segment
of the project, we incorporated the model into the framework of the exercise
mode of the IIDA program. Users of the exercise mode are taught to formulate
strategy, to select and combine terms, to display a few items from the
retrieved set and make a decision, based on the relevance of the records in
the set, on whether and how to reformulate the search, and then either to
reiterate the above steps or print out the complete set and log off.

The model also played a role in the development of the diagnostics IIDA
will use in the exercise and assistance modes to determine whether a search
is progressing well, and if not, what the nature of the difficulty might be.
The "cycle analysis" described below began in part by going to the individual
search flow outlines to look for more specific indicators which could point
to problems in reaching a successful conclusion to a search.

Each of the nine searches was examined in terms of some very simple,
measurable parameters, such as the number of commands of the same type used
in sequence, the number of commands in a cycle of select-combine type
commands, the number of cycles in the entire search, and so on, to see
if any of these parameters could be used to give an indication of how well
the search was progressing. It was found that several of these measures
could be used to detect poor searching behaviors. This information was then
included in a set of "rules" which, when broken, will trigger a response
from IIDA.


2. Diagnostic Procedures and a Search Exercise

2.1 Introduction

To review briefly the plan for instruction and assistance to IIDA users
(4), IIDA will operate in either of two modes: exercise and assistance. In
our original concept, there was a tutorial mode, intended to precede the other
two in sequence of use. It would have provided basic instruction in searching.
Since our revised goal is to prove the concepts of IIDA, rather than to
operate a commercial service, we have omitted the tutorial mode because
of its expense, and will instead provide what basic instruction is needed to
our experimental subjects by conventional techniques.

In exercise mode, IIDA reviews the basics of searching and gives the
student user an opportunity to work on a simple search, using a limited subset
of the full user language. Exercise mode contains three exercises. Exercise
1 reviews the basic DIALOG search commands (BEGIN, EXPAND, PAGE, SELECT,
COMBINE and TYPE) and introduces the user to IIDA's HELP facilities. The
user enters commands exactly as he is told to; he has no discretion. The
purpose of the exercise is to show him the effects of use of commands.

Exercise 2 restricts the user to the same commands introduced in Exercise
1, but he has more freedom in using them as he sees fit while carrying out a
search assigned to him by IIDA. In doing this he will get experience in doing
a brief, but actual, search and will begin to get familiar with IIDA's
diagnostic procedures.

Exercise 3 introduces some of the more advanced commands and techniques.
We again revert to a style in which the user is shown the effects of various
usages, but there is no complete search to perform.

Assistance mode, which is, in effect, exercise 4, permits the user to
perform any search whatever, in ERIC or NTIS, using any valid DIALOG commands.
IIDA monitors progress and, when either invoked by the user or when it
decides for itself that the user needs assistance, IIDA tells what problems
it has detected and offers a variety of HELP services. The nature of the
help available ranges from definitions of commands, to advice on how to
proceed, to an opportunity to begin again or review in exercise mode.

In our previous report (5) we provided a description of the first
exercise. In this section we discuss the diagnostic procedures to be used
in the second exercise and provide detailed specifications for that program.

It is our expectation that Exercises 1 and 2 will be completed and made
operational by the end of November, 1978. Exercise 3 and the assistance mode
programs will be completed by March, 1979. "Completed," in both cases,
means ready for testing. The nature of the IIDA system is such that verifying
that computer programs perform as specified is not sufficient. It is also
necessary to try them out with users to verify that they were designed to do
the right things. Further, we expect there to be a number of adjustments in
the values of control variables as a result of testing. Upon completion of
tests with the complete system, we expect another round of revision before
we declare the programs to be "ready."

Assistance mode is somewhat like Exercise 2, but differs in two
important respects: the latter permits users to make use of only a limited
number of DIALOG commands and the diagnostic procedures are less sophisticated.
Diagnostic procedures for Exercise 2 have been specified in terms of 37
rules (an example of which is that if the user has created three consecutive
null sets this fact is brought to his attention, both to give him some
information about what is wrong and to indicate that something is wrong).
We anticipate that assistance mode will require on the order of 100 similar
rules.
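
The example rule above (three consecutive null sets) lends itself to a short sketch. The function name `consecutive_null_sets` and its `threshold` parameter are hypothetical, introduced only for illustration; the report does not give the form in which IIDA stores its rules.

```python
def consecutive_null_sets(set_sizes, threshold=3):
    """Return True once `threshold` null (empty) sets occur in a row.

    set_sizes lists the number of records in each set, in creation order.
    """
    run = 0
    for size in set_sizes:
        run = run + 1 if size == 0 else 0   # any non-null set resets the run
        if run >= threshold:
            return True
    return False
```

A search that produced sets of sizes 12, 0, 0, 0 would break this rule, while sizes 0, 0, 5, 0 would not, because the non-null third set interrupts the run.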






2.2 Diagnostic Procedures

The procedures discussed below will be implemented in stages, and all
require testing to verify their utility or to determine the appropriate level
of various thresholds or parameters involved in their use. Some will be
implemented during the ensuing quarter (October - December 1978) and some in
the next quarter. Those planned for implementation prior to the end of 1978
are identified as such below.

2.2.1 Search Structure

Although work is being done, within this project, on a study of a general
model of the search process, we do not ever expect to arrive at a point where
searching becomes so routinized that, given an analysis of an incomplete
record of a person's search, either IIDA or any human could prescribe exactly
what is to be done to complete it successfully. We must continue, then, to
deal with somewhat hazily defined measures and with heuristic procedures.

A search consists of a sequence of commands. Commands are classified,
in IIDA, by type, according to a scheme depicted in Figure 5 and based upon
one suggested by Penniman (6). For the most frequently used commands, the
rationale is that BEGIN, END and LOGOFF commands all perform a similar
function: they delimit the boundaries of a search. The commands EXPAND and
SELECT retrieve information about individual descriptors or phrases. There
are several variations on each command and they are subclassified as shown
in the figure. SELECT, of course, results in creation of a set based upon
a descriptor, while EXPAND only provides information about that descriptor.
COMBINE is a command that operates upon sets, not upon individual terms,
hence it is a different type. It results in the creation of a new set by
some Boolean combination of previously defined sets. TYPE and DISPLAY are
virtually identical commands used to cause records or portions of them to be
displayed on the user's terminal, a printer in the former case, a CRT in the
latter. The purpose of issuing either command is assumed to be browsing:
to look through a sample of records to determine whether a set has met the
user's requirements. In some cases, one of these commands could terminate
a search, but often, upon finding a likely looking set while browsing, the
user wants an exhaustive print of the entire contents of the set and this
is done with the off-line command PRINT, resulting in the computer listing
being mailed to the user. These commands constitute command types 1, 2, 3,
4 and 5. Types 6 and 7 are infrequently used.

c_type_maj  c_type_min  Command       Comments

    1           0       BEGIN
    1           1       .FILE
    1           2       END
    1           3       END/SAVE
    1           4       END/SDI
    1           5       LOGOFF
    1           8       LIMITALL
    2           0       EXPAND        yields a segment from alpha index
    2           1       EXPAND        yields a segment from thesaurus
    2           3       SELECT        single descriptor only
    2           4       SELECT        single descriptor E-table
    2           5       SELECT        multiple descriptors from E-table
    2           6       SELECT        contains an infix
    2           7       SELECT        term is truncated
    2           8       PAGE          used in context of EXPAND
    3           0       COMBINE       all operators are "AND"
    3           1       COMBINE       all operators are "OR"
    3           2       COMBINE       mixed "AND" and "OR" operators
    3           3       COMBINE       (same as above; distinction to be
                                      determined)
    3           3       LIMIT
    4           0       TYPE          with set argument
    4           1       DISPLAY       with set argument
    4           2       TYPE          with accession # argument
    4           3       DISPLAY       with accession # argument
    4           8       PAGE          used in context of TYPE/DISPLAY
    5           0       PRINT
    5           1       PRINT         contains sort fields
    5           2       PR-           cancel print command
    6           0       EXPLAIN
    6           1       DISPLAY SETS
    7           0       .RECALL
    7           1       .EXECUTE
    7           2       .RELEASE

Figure 5. Classification of DIALOG commands into
major and minor types.
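
As a rough illustration of the major-type classification described above, the following Python sketch maps a command line to its major type using only its first word. The dictionary and the function `classify` are assumptions made for this example, not IIDA's implementation; the command arguments in the usage lines are invented, minor types are ignored, and Types 6 and 7 (including DISPLAY SETS) are deliberately left out.

```python
# Major types for the frequently used commands named in the text.
MAJOR_TYPE = {
    "BEGIN": 1, "END": 1, "LOGOFF": 1,   # delimit the boundaries of a search
    "EXPAND": 2, "SELECT": 2,            # work on individual descriptors
    "COMBINE": 3,                        # Boolean operations on sets
    "TYPE": 4, "DISPLAY": 4,             # browsing
    "PRINT": 5,                          # off-line printing
}


def classify(commands):
    """Return the major type of each command line (None if unclassified).

    PAGE has no type of its own here: it takes Type 2 after an
    EXPAND-type command and Type 4 after a TYPE/DISPLAY-type command.
    """
    types = []
    for line in commands:
        words = line.split()
        verb = words[0].upper() if words else ""
        if verb == "PAGE":
            previous = types[-1] if types else None
            code = previous if previous in (2, 4) else None
        else:
            code = MAJOR_TYPE.get(verb)
        types.append(code)
    return types


codes = classify(["BEGIN", "EXPAND READING", "PAGE", "SELECT E3",
                  "COMBINE 1 AND 2", "TYPE 3", "PAGE", "PRINT 3"])
```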

We define a string to be an uninterrupted sequence of commands of the
same type. Thus, four consecutive COMBINE commands is a Type 3 string of
length four. A sequence such as SELECT, EXPAND, SELECT, EXPAND, SELECT is
a Type 2 string of length 5: all of the commands are of Type 2, even if
they are different commands, just as END, LOGOFF or END, BEGIN are strings
of Type 1, length 2, even though there are two different commands in each.
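
The string definition above amounts to grouping consecutive equal type codes. A minimal sketch, assuming the commands have already been reduced to their major type codes (the function name `strings` is hypothetical):

```python
from itertools import groupby


def strings(type_codes):
    """Collapse a sequence of major type codes into (type, length) strings."""
    return [(code, len(list(run))) for code, run in groupby(type_codes)]


# SELECT, EXPAND, SELECT, EXPAND, SELECT are all Type 2: one string of length 5.
five_type_2 = strings([2, 2, 2, 2, 2])
# Four consecutive COMBINE commands: one Type 3 string of length four.
four_type_3 = strings([3, 3, 3, 3])
```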



In virtually all of the searches we have examined, and also as reported
by others (2, 6, 7), a search consists of a sequence of strings appearing in
increasing numerical order as we have classified them. That is, a search
might begin with a Type 1 string, followed by a Type 2, 3 and 4 in order.
Then there might be another sequence of strings, beginning with a Type 1 or
2 string, and again proceeding to a Type 4 or 5. These cycles may continue
over an extended period of time. We define a cycle to be a sequence of
command strings such that, by our numbering system, the string type increases
as the sequence proceeds. A string of a type lower than its predecessor begins
a new cycle. For example, suppose a search consists of the following commands:

     1. BEGIN
     2. EXPAND
     3. SELECT
     4. SELECT
     5. COMBINE
     6. TYPE
     7. SELECT
     8. COMBINE
     9. TYPE
    10. PRINT



We consider this a two-cycle search. The first cycle starts with the BEGIN
command (#1) and ends with the first TYPE command (#6). The next command
(#7) is SELECT, which is of a lower type code than TYPE, hence the string
and cycle both end and a new cycle begins with this SELECT command. As a
generality, experienced searchers are parsimonious in terms of string
length and number of cycles, but this alone seems not sufficient to
discriminate between a well performed and a poorly performed search.
We do, however, use these measures as indicators of other, more specific
faults in a searcher's performance.
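
The cycle rule can be checked against the ten-command example with a short sketch. It assumes the commands have already been mapped to major type codes (BEGIN = 1, EXPAND and SELECT = 2, COMBINE = 3, TYPE = 4, PRINT = 5); the function name `split_cycles` is hypothetical.

```python
def split_cycles(type_codes):
    """Split a sequence of major type codes into cycles.

    A code lower than its predecessor starts a new cycle; equal or
    higher codes extend the current one.
    """
    cycles = []
    for code in type_codes:
        if cycles and code >= cycles[-1][-1]:
            cycles[-1].append(code)
        else:
            cycles.append([code])
    return cycles


# BEGIN, EXPAND, SELECT, SELECT, COMBINE, TYPE, SELECT, COMBINE, TYPE, PRINT
example = [1, 2, 2, 2, 3, 4, 2, 3, 4, 5]
cycles = split_cycles(example)
```

For the example, the split falls at command #7, so `len(cycles)` is 2, in agreement with the two-cycle reading given in the text.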

2^2.2 Syntactic Analysis 

The complete logic of the IIDA syntactic analysis is presented in an
earlier report (5). In the context of overall performance analysis, syntax
analysis is done in order to detect specific mechanical errors which must be
corrected. That, in fact, is the only meaningful definition of error in
our search analysis: an error is something which must be corrected,
while other performance aberrations do not necessarily have to be. A
syntactic error is a fault in the composition of a command such that it
will not be executed by DIALOG.


There is little that can be done upon discovery of a syntactic error 
other than to inform the user of the fact and request its remedy. IIDA can 
add information concerning the frequency with which this kind of error has 
been made by this user and it could, although it is not now so designed, 
impose information and exercises upon the user when he errs too much. 

In IIDA, the detection and resolution of syntactic errors is done
separately from detection and resolution of other faults, as shall be
explained below.



2.2.3 Procedural Analysis

Once a command has been determined to be syntactically acceptable, we are
concerned with its productive use, and here we can only rarely say that a
command is absolutely in error, and rarely can we tell a user exactly what
to do about it. Hence, detection of procedural errors is all probabilistic
and their remedy is all heuristic.

Procedural diagnostics are performed in the areas listed below. Those
not being implemented for Exercise 2 are in parentheses and are marked with
an asterisk (*).

1. String and cycle statistics, used both alone and as
indicators of other problems.

2. Repetitions of commands, both literal repetition and
"essential" repetition, such as COMBINE 1 AND 2 and C 2 * 1, which
have exactly the same effect.

3. Use of descriptors -- checking whether descriptors whose use has
generated null sets were checked in the thesaurus (and whether particular
descriptors appear to be involved in an unproductive COMBINE string.*)

4. Sets created: number of null sets, unused (i.e., not referred
to) sets, use of null sets in COMBINE or TYPE commands.

5. Thrashing, dwelling and convergence -- these related concepts are
defined in more detail below. They refer to repeated behavior in the
formation of combinations of sets which leads to unproductive results.

6. Browsing (the searcher's behavior in selecting records to
be viewed, possible repetitious selection and use of display formats.*)

7. Relevance -- the searcher is asked to make a relevance
assessment of every record he has displayed to him, and these judgments
are reviewed by IIDA to determine whether there seems to be progress
toward attaining an acceptable set.

Thrashing, in general, is a pattern of rapid shifting of search direction
on the part of the searcher and is probably indicative of his not taking the
time to follow through on any one idea. Hence thrashing is considered to be
unproductive. As a formal definition, thrashing requires a string of
COMBINE commands of length L and an average value of the similarity index
among the commands of less than S1. L will be initially set at an arbitrary
value, but must eventually be set by experience. The similarity index
computes a measure of similarity of the sets of descriptors used in COMBINE
commands. S is computed from the following quantities:

     S_ij is the similarity index between two commands, i and j

     d_ij is the number of descriptors in common to commands i and j

     d_i  is the number of descriptors used in command i

     d_j  is the number of descriptors used in command j

The computation of a similarity index ignores the boolean operators used,
and is concerned exclusively with the descriptors used. The average similarity
index value for a string of commands is the mean of the values of similarity
between successive pairs of commands in the string.

Dwelling is a behavioral pattern opposite to thrashing. It represents
a mode of use in which the searcher does not make significant changes in his
searching patterns, but instead tries again and again to create a set which
is only a minor variant of previously defined sets. Typically such a searcher
is probably trying to refine a set beyond the sensitivity of the search
language, or data base, to distinguish between similar definitions. Formally,
dwelling is said to occur when a COMBINE string exceeds length L and when the
average similarity index value is greater than S2. This is the requirement
that the commands be similar.
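
The thrashing and dwelling tests can be sketched as follows. The report's own formula for the similarity index is not legible at this point, so this Python sketch substitutes a Dice-style coefficient, 2 d_ij / (d_i + d_j), purely as an assumption. The function names, the thresholds s1 and s2, and the minimum string length are likewise hypothetical placeholders, not IIDA's values.

```python
def similarity(desc_i, desc_j):
    """Assumed Dice-style index 2*d_ij / (d_i + d_j); boolean operators are ignored."""
    a, b = set(desc_i), set(desc_j)
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))

def average_similarity(combine_string):
    """Mean similarity between successive pairs of commands in the string."""
    values = [similarity(x, y) for x, y in zip(combine_string, combine_string[1:])]
    return sum(values) / len(values) if values else None

def classify(combine_string, min_length=4, s1=0.25, s2=0.75):
    """Return 'thrashing', 'dwelling' or 'ok' for a string of COMBINE commands.

    Each command is represented by the collection of descriptors it involves.
    min_length, s1 and s2 are illustrative values only.
    """
    if len(combine_string) < min_length:
        return "ok"
    avg = average_similarity(combine_string)
    if avg is None:
        return "ok"
    if avg < s1:
        return "thrashing"
    if avg > s2:
        return "dwelling"
    return "ok"

thrash = [{"a", "b"}, {"c", "d"}, {"e", "f"}, {"g", "h"}]
dwell = [{"a", "b", "c"}, {"a", "b", "c", "d"}, {"a", "b", "c"}, {"a", "b", "c", "e"}]
print(classify(thrash), classify(dwell))
```

Any other overlap measure built from d_ij, d_i and d_j could be dropped into similarity() without changing the rest of the sketch.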

When a person is dwelling, creating a set of closely related sets, it is
also of interest to note whether he is, in fact, making any progress toward
his stated goal. At the beginning of a search IIDA will ask a searcher to
identify a goal, in broadly quantitative terms, e.g., a single good reference,
a few good ones, or an exhaustive bibliography. This goal will be taken as
a numeric goal by IIDA and the sizes of successively created sets will be
compared with this goal to determine whether, on a purely numerical basis,
the user seems to be nearing his goal. We identify five conditions:

     --Set sizes are increasing toward the goal.
     --Set sizes are decreasing toward the goal.
     --Set sizes are increasing away from the goal.
     --Set sizes are decreasing away from the goal.
     --Indeterminate (i.e., the direction of movement is
       too erratic or there is no movement.)



When a searcher has been detected dwelling, this convergence information 
can be of extra help in pointing out to him what the effect has been. 
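
The five convergence conditions can be illustrated with a short Python sketch. The decision procedure here is an assumption: movement is called increasing or decreasing only when every step moves the same way, and "toward" means the last set is numerically closer to the goal than the first. The report does not specify how IIDA makes this judgment.

```python
def convergence(set_sizes, goal):
    """Classify the trend of successive set sizes relative to a numeric goal."""
    if len(set_sizes) < 2:
        return "indeterminate"
    steps = [b - a for a, b in zip(set_sizes, set_sizes[1:])]
    if all(s > 0 for s in steps):
        direction = "increasing"
    elif all(s < 0 for s in steps):
        direction = "decreasing"
    else:
        return "indeterminate"          # erratic movement or no movement
    start_gap = abs(set_sizes[0] - goal)
    end_gap = abs(set_sizes[-1] - goal)
    if end_gap == start_gap:
        return "indeterminate"
    side = "toward" if end_gap < start_gap else "away from"
    return f"{direction} {side} the goal"

print(convergence([200, 120, 60], goal=20))   # decreasing toward the goal
```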



2.2.4 Conversations with the User




Recalling that all but syntactic errors are not strictly classified as
absolute errors, and that the IIDA programs cannot comprehend what is in the
mind of the user, the objectives of the IIDA user conversations are: a) to
describe problems detected by IIDA, b) to induce the user to confront them
directly, and c) to provide hints on possible courses of action when possible.
If these objectives are accomplished then we can reasonably expect the user to
solve his own problems.

IIDA works under certain restrictions. It is necessarily reactive, i.e.,
IIDA can only react to steps taken by a user; it cannot intuit what needs to
be done and do it for him. All messages except those reporting absolute
errors must be ignorable, i.e., they must be so worded that the user sees
them as advisory in nature and knows that, if he feels secure in what he is
doing, he may continue in the direction he has been going. The IIDA system
must also be inoffensive. We do not want users to feel any conflict
between themselves and the system or that the system is "behaving" in a
haughty or patronizing manner. We want it to state potential problems in a
straightforward, unemotional way. Finally, even where the user is making
repeated errors, we do not want IIDA to have the appearance of nagging or
badgering the user. Principally this last constraint means that at times
we will suppress a fault-indicating message rather than be repetitious or
boring.

A message control program has been developed within IIDA to meet the
last of these requirements. Suggested by the TASK MONITOR of NLS-SCHOLAR (8),
this program looks over the user's fault history and decides how to react
to the totality of student performance, rather than just his last command.
The others are largely met by the tone in which messages are written and by
the decision never to cut a user entirely off the system unless his errors
are such that he cannot logically continue.

Whenever a fault threshold is exceeded, or a fault triggered, the
appropriate diagnostic program reports this fact to a central Warning
Control Program (WCP). It is possible that any given user command can
trigger several warnings. The functions of the WCP are to decide to
send messages, not send messages, step up or strengthen the severity of a
message, or to add connecting phrases between multiple messages, as
appropriate at any given time.

More specifically, the WCP is given a list of all faults triggered
following any given command. It also has available the history of
previously issued warnings. By scanning the current list of faults and
recent previous ones, it can decide to:

1. Transmit a fault message as originally written.

2. Defer a message if the same fault has "recently" occurred and
the user has been notified. The definition of "recently" will have
to be determined experimentally, but, for example, we assume that if
at command n he has been told he may have issued too many consecutive
SELECT commands, and if he issues one more such command at n + 1,
there is no point rebuking him. He should be allowed to finish his
thought and follow it through without being badgered. Whenever a
message is deferred, a record is kept of that fact.

3. Suppress a message if another message covers the same fault but
is more specific. For example, too many consecutive COMBINE commands
is a general fault. Dwelling is a more specific fault of the same
general type. Repeating commands is an even more specific fault. A
history is not kept of suppressed messages, because the essence of the
message would have been sent by another message.

4. Step up a message if a previously deferred fault has recurred. Thus,
if command n exceeded a threshold for length of a string, and command
n + 1 continued the string, we would defer a fault message at n + 1.
If the problem continued, however, sooner or later it would be
necessary to resume sending the deferred message and, when that happens,
its strength should be enhanced by some phrase such as "This has occurred
m times since the last warning." How many times a message should be
deferred is also to be determined experimentally.
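
The four decisions above can be sketched as a small state machine. This is a Python illustration under stated assumptions, not the WCP's PL-1 logic: the names FaultRecord and decide, the recent_window and max_deferrals parameters, and the choice to count the current occurrence in m are all hypothetical, since the report leaves "recently" and the deferral limit to be determined experimentally.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FaultRecord:
    last_event_at: Optional[int] = None  # command number of the last send or deferral
    deferred: int = 0                    # deferrals since the last message was sent

def decide(fault, command_no, active_faults, more_specific, history,
           recent_window=1, max_deferrals=2):
    """Return (action, m) where action is transmit, defer, suppress or step_up."""
    # Suppress: a more specific fault was triggered by the same command.
    if more_specific.get(fault, set()) & set(active_faults):
        return ("suppress", 0)           # no history is kept for suppressed messages
    rec = history.setdefault(fault, FaultRecord())
    recent = (rec.last_event_at is not None
              and command_no - rec.last_event_at <= recent_window)
    # Defer: the user was notified (or the message deferred) very recently.
    if recent and rec.deferred < max_deferrals:
        rec.deferred += 1
        rec.last_event_at = command_no
        return ("defer", 0)
    # Step up: a previously deferred fault has recurred.
    if rec.deferred > 0:
        m = rec.deferred + 1             # assumed to include the current occurrence
        rec.deferred = 0
        rec.last_event_at = command_no
        return ("step_up", m)
    rec.last_event_at = command_no
    return ("transmit", 0)

history = {}
specific = {"too_many_combines": {"dwelling", "repeated_command"}}
actions = [decide("too_many_selects", n, {"too_many_selects"}, specific, history)
           for n in (10, 11, 12, 13)]
print(actions)
```

With these illustrative settings, a fault raised at four consecutive commands is transmitted once, deferred twice, and then stepped up with m = 3.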

Finally, a minor function of the WCP is to insert connecting words
between fault messages. These, not yet designed at the time of this report,
may be as simple as to add such phrases as "and also" between fault-describing
messages.

If time permits, we hope to experiment with whether we can have the
WCP respond differently to different patterns of user behavior. Thus
certain kinds of faults might make the system more stringent, others less
so.

Syntactic error messages are not subjected to the same kinds of
analysis as procedural fault messages, because we feel that each syntactic
error must be brought to the user's attention. We might use the WCP to add
reminders when particular patterns of repeated error are detected.

2.3 Exercise 2 



As indicated earlier, this exercise will familiarize the student with
IIDA diagnostics. The student will be presented with a search problem and
the suggestion of a general search strategy. Because suffixes and infix
notations will not be used in this exercise, the statement of the search
requirement will include terms which appear in the thesaurus.

The student will use the same commands used in Exercise 1 but he will
be free to use them to create search strategies of his choosing, subject to
the following restrictions: a) the PRINT command may not be used; b) a limit
will be imposed on the number of citations listed using the TYPE command
so that excessive listings may be avoided; and c) occasionally the student
will be forced to call the HELP facility.






The progress of the search will be monitored by IIDA for syntactic and
procedural problems. When a procedural problem is detected, the student is
advised of it and may elect to use the HELP facility to ask for possible
solutions to the problem. The student is free to accept or reject these
solutions.

Use of the HELP facility in response to a message indicating a procedural
problem, whether or not the recommended solutions are adopted, will begin
to expose the student to the techniques of developing search strategy. Also,
completion of this exercise will give the student experience in doing an
actual search.

2.3.1 Program Description

The procedural diagnostics of Exercise 2 will be performed by a
THRESHOLD ANALYZER, which consists of two PL-1 programs. The THRESHOLD
ANALYZER compares various measures of student performance against assigned
thresholds and generates appropriate messages if these thresholds are
exceeded. These thresholds are written as a series of rules which reference
the Student Data Structure (see Table 1) for information on the progress of
the search, check this against the assigned values, and generate codes for
messages which correspond to the surpassed thresholds. Message codes are
processed by the WARNING CONTROL SUBROUTINE, which establishes a priority
for the issuance of messages within the context of a history of previously
issued messages.

2.3.2 Threshold Analyzer

The two PL-1 programs which make up the THRESHOLD ANALYZER are called
by IIDA as special actions. The first, or CANALYSIS program, executes those
rules which require only a command from the student and information from
the Student Data Structure as input; the second, or RANALYSIS program, executes
those rules which require a command, data structure information, and a
response from the host database. A sample of the PL-1 code for the THRESHOLD
ANALYZER appears in Table 2.

The THRESHOLD ANALYZER rules are presented both in Table 3 and in
narrative description form below. The threshold values cited are arbitrary,
although rough hand simulations applying these rules to the transcripts of
actual searches indicate that these values are useful as initial approximations.
Subsequent testing will lead to substantial refinement of these thresholds.

Rule 1: If, after a successful log on with the BEGIN command,
the BEGIN command is issued again, the student is advised to
refrain from further use of this command and the current command
is not passed to DIALOG.

Rule 2: If the PRINT command is issued, the student is advised
that the command is illegal.






    /* STUDENT DATA BASE: DATA ON THIS SEARCH */

    dcl 01 command_history (100) ext,
          03 c_text       char (50) varying,  /* argument of command */
          03 c_class      fixed bin,          /* 0=valid, 1=error, 2=control */
          03 c_set        fixed bin,          /* set # created, sel & com... */
          03 c_type_maj   fixed bin(7),       /* type-maj of command */
          03 c_type_min   fixed bin(7),       /* type-min of command */
          03 c_string     fixed bin,          /* string number */
          03 c_group      fixed bin,          /* group number */
          03 c_time       fixed bin (71),     /* time command was entered */
          03 c_time_diff  fixed bin (71),     /* time since last command */
          03 c_stack      fixed bin;          /* stack number */

    dcl 01 set_history (98) ext,
          03 s_text       fixed bin,          /* pointer to cmd history */
          03 s_size       fixed bin,          /* set size */
          03 s_refs       fixed bin,          /* times referenced */
          03 s_rel        fixed (3,2);        /* average relevance of set */

    dcl 01 error_history (100) ext,
          03 e_text       fixed bin,          /* pointer to command history */
          03 error_type   fixed bin,          /* error type code */
          03 ews (4)      bit (1),            /* error warning subroutine fl... */
          03 e_connect    fixed bin;          /* # of connecting code */

    dcl 01 sets_viewed_history (30) ext,
          03 r_text        fixed bin,         /* points to command history t... */
          03 set_viewed    fixed bin,         /* set number */
          03 record_format fixed bin,         /* format of records viewed */
          03 record_range,                    /* range of records viewed */
            04 first_rec   fixed bin,
            04 last_rec    fixed bin,
          03 rv_knt        fixed bin,         /* number of recs viewed */
          03 record_viewed (20),
            04 access_num  char (8),          /* DIALOG acc # */
            04 relevance   fixed bin,         /* user-assigned relevance */
          03 view_avg      fixed (3,2);       /* average relevance for view... */

    dcl 01 help_history (15) ext,
          03 help_text     fixed bin,         /* pointer to HELP command */
          03 help_type (15) bit (1);          /* type of HELP called */

    dcl 01 table_pointers ext,
          03 sdp (98)  offset (sdb),          /* set descriptors pointers */
          03 dhp (100) offset (sdb);          /* descriptor histories pointers */

    dcl 01 set_desc based,
          03 s_desc_exp   char (256) varying, /* expanded descriptors */
          03 s_desc_norm  char (256) varying, /* normalized descriptors */
          03 s_tag_type   fixed bin,          /* 0=none, 1=prefix, 2=suffix, ... */
          03 s_limit_suf  char (8),           /* suffix on limit, if presen... */
          03 s_d_num      fixed bin,          /* # of unique desc in set */
          03 s_desc (15)  char (42) varying;  /* descriptors */

Table 1. Student Data Structure



    dcl 01 descriptors_history based,
          03 desc_key     char (40),          /* key to the sdb record */
          03 arg_usage (?),                   /* usage of argument */
            04 d_knt      fixed bin,          /* number of times used */
            04 d_cmds (25) fixed bin;         /* particular places used */

    dcl sdb area (32767) ext;                 /* area for set-desc and desc-... */

    dcl 01 expand_data ext,
          03 e_table (50),
            04 e_term       char (42) varying, /* descriptor */
            04 e_items      char (6),          /* postings */
            04 e_rt         char (3),          /* related terms */
          03 r_table (50),                     /* as above, for relate */
            04 r_term       char (42) varying,
            04 relationship char (1),
            04 r_items      char (6),
            04 r_rt         char (3),
          03 e_iida,                           /* as above, for IIDA issued ... */
            04 t_time       fixed bin (71),
            04 e_temp (50),
              05 t_term     char (42) varying,
              05 t_items    char (6),
              05 t_rt       char (3),
          03 expanded_idx (20),
            04 first_term   char (42) varying, /* first term seen in table */
            04 last_term    char (42) varying; /* last term seen in table */

    dcl 01 support_data ext,
          03 index_data,
            04 c_last     fixed bin,          /* command hist */
            04 s_last     fixed bin,          /* set hist */
            04 e_last     fixed bin,          /* error hist */
            04 r_last     fixed bin,          /* records viewed hist */
            04 rv_last    fixed bin,          /* last record viewed */
            04 h_last     fixed bin,          /* help usage history */
            04 d_last     fixed bin,          /* descriptor history */
            04 ex_last    fixed bin,          /* EXPAND table entry */
            04 rel_last   fixed bin,          /* RELATE table entry */
            04 ex_iida    fixed bin,          /* IIDA issued EXPAND */
            04 ei_last    fixed bin,          /* expanded index */
            04 stack_last fixed bin,          /* stacked command */
            04 v_last     fixed bin,          /* last valid command */
            04 g_last     fixed bin,          /* group count */
            04 st_last    fixed bin,          /* string count */
          03 first_time_marks,
            04 s_first    bit (1),            /* sets created */
            04 r_first    bit (1),            /* records viewed */
            04 v_first    bit (1),            /* valid command */
            04 ex_first   bit (1),            /* EXPAND table */
            04 rel_first  bit (1),            /* RELATE table */
            04 iida_first bit (1),            /* IIDA EXPAND table */
            04 e_r        bit (1),            /* 0=E-table, 1=R-table */

Table 1. Student Data Structure (Continued)



          03 time_data,
            04 search_time  fixed bin (71),   /* total search time */
            04 time_avg     fixed bin (71),   /* average time btwn cmds */
          03 cit_total      fixed bin,        /* total number of citations v... */
          03 cycle_data,
            04 group_data (?),
              05 group_start    fixed bin,    /* pointer to first cmd in gro... */
              05 group_length   fixed bin,    /* # of commands in group */
              05 string_data (10),
                06 string_type   fixed bin,   /* type of string */
                06 string_start  fixed bin,   /* pointer to first c... */
                06 string_length fixed bin,   /* # cmds in string */
                06 dom_com       fixed bin,   /* dominant type of command */
              05 group_rel      fixed (3,2),  /* average relevance for group */
            04 group_rel_hi     fixed (3,2),
          03 zero_data,                       /* zero set check */
            04 zero_knt_search  fixed bin,    /* total # cmds zero postings */
            04 zero_pct         fixed (5,2),  /* percent zero commands */
            04 zero_knt_cons    fixed bin,    /* cons zero commands */
            04 zero_knt_cycle   fixed bin,    /* zero cmds in cycle */
          03 error_data,                      /* total errors */
            04 error_total      fixed bin,    /* total # errors */
            04 err_pct_total    fixed (5,2),  /* pct total cmds in error */
            04 et_knt (500)     fixed bin,    /* count of errors by type */
          03 cmnd_data,
            04 ct_knt (7)       fixed bin,    /* total cmds by type */
            04 cmd_knt          fixed bin,    /* total valid cmds */
            04 set_knt          fixed bin,    /* number of sets in list below */
            04 set_list (15)    fixed bin,    /* sets isolated in COMBINE ... */
          03 help_data,
            04 help_time        fixed bin (71), /* cumulative time in HELP */
            04 help_time_pct    fixed (5,2),  /* pct of time in HELP */
            04 h_c_pct          fixed (5,2),  /* pct of commands calling HELP */
          03 rel_data,                        /* used to compute average rel... */
            04 rel_sum          fixed bin,
            04 rel_knt_cycle    fixed bin,
            04 rel_sum_cycle    fixed bin,
            04 rel_knt_set      fixed bin,
            04 rel_sum_set      fixed bin,
          03 rep_data,                        /* data about repetitions */
            04 rep_total        fixed bin,    /* total repetitions */
            04 rep_knt_cycle    fixed bin,    /* count of repetitions in cyc... */
            04 rep_knt_search   fixed bin,    /* consecutive repetitions */
          03 set_ref_data,
            04 no_ref_total     fixed bin,    /* total references to set */
            04 no_ref_cycle     fixed bin;    /* total references in cycle */

Table 1. Student Data Structure (Continued)



    /* RULE 3 */
    IF STRING_LENGTH(G_LAST,ST_LAST) >= 8 & C_TYPE(C_LAST) = 2 &
       DOM_COM(G_LAST,ST_LAST) = 1 THEN DO;
       E_LAST = E_LAST + 1;
       ERROR_TEXT(E_LAST) = C_LAST;
       ERROR_TYPE(E_LAST) = 403;
       TE_SEND(3) = '1'B;
    END;

    /* RULE 4 */
    IF STRING_LENGTH(G_LAST,ST_LAST) >= 8 & C_TYPE(C_LAST) = 2 &
       DOM_COM(G_LAST,ST_LAST) = 2 THEN DO;
       E_LAST = E_LAST + 1;
       ERROR_TEXT(E_LAST) = C_LAST;
       ERROR_TYPE(E_LAST) = 404;
       TE_SEND(4) = '1'B;
    END;

    /* RULE 5 */
    IF STRING_LENGTH(G_LAST,ST_LAST) >= 8 & C_TYPE(C_LAST) = 2 THEN DO;
       E_LAST = E_LAST + 1;
       ERROR_TEXT(E_LAST) = C_LAST;
       ERROR_TYPE(E_LAST) = 405;
       TE_SEND(5) = '1'B;
    END;

Table 2. Sample of the PL-1 code for THRESHOLD ANALYZER.

[Decision table, Rules 1 - 10; the cell entries are not legible in the scan.
Condition rows: string_length, c_type(c_last), zero_knt_cons, zero_knt_cycle,
zero_knt_search, no_ref_cycle, no_ref_total, cy_change, rep_knt_cycle,
rep_knt_search, no_ref_avg, sim_avg, group_rel_hi > group_rel(g_last),
set_size_disp, record_format(r_last), c_group(c_last) not equal to
c_group(c_last - 1), time_diff, cit_total, dom_com.
Action rows: prepare message number, force call to HELP facility,
record warning, Warning Control Program.]

Table 3. THRESHOLD ANALYZER rules.



[Decision table, Rules 11 onward; the cell entries are not legible in the scan.]

y - yes; t1, t2, t3, t4, t5 - threshold values to be determined;
u - uninformative format (to be determined); c - converging toward
goal; d - diverging from goal; s - static relative to goal;
f - greater than goal but less than 1.5 x goal; g - greater than
1.5 x goal but less than 2 x goal; h - greater than or equal to
2 x goal

Table 3. THRESHOLD ANALYZER rules (Continued)



Rule 3: If the EXPAND command occurs eight times consecutively,
the student is warned.

Rule 4: If the SELECT command occurs eight times consecutively,
the student is warned.

Rule 5: If Type 2 commands occur eight times consecutively, the
student is warned.

Rule 6: If the COMBINE command occurs five times consecutively,
the student is warned.

Rule 7: If the TYPE command occurs five times consecutively,
the student is warned.

Rule 8: If the string length of any combination of commands is
ten, the student is warned and forced to call the HELP facility.

Rule 9: If two consecutive null sets occur from the use of the
SELECT command, the student is warned.

Rule 10: If two consecutive null sets occur from the use of the
COMBINE command, the student is warned.

Rule 11: If two consecutive null sets occur, the student is
warned.

Rule 12: If three null sets occur in a cycle from the use of
the SELECT command, the student is warned.

Rule 13: If three null sets occur in a cycle from the use of
the COMBINE command, the student is warned.

Rule 14: If three null sets occur in a cycle, the student is
warned.

Rule 15: If the total number of null sets thus far in the search
resulting from use of the SELECT command is five, the student is
warned.

Rule 16: If the total number of null sets thus far in the search
resulting from the use of the COMBINE command is five, the student
is warned.

Rule 17: If the total number of null sets thus far in the search
is five, the student is warned.

Rule 18: If a null set is referenced in a TYPE command, the
student is warned.






Rule 19: If three non-used sets are created during a cycle,
the student is warned.

Rule 20: If the total number of non-used sets in the search
thus far is five at the beginning of a cycle and this number is
not reduced during the cycle, the student is warned.

Rule 21: If the number of repetitions (i.e., the number of
occurrences - 1) of a command is two or less thus far in the
search and at least one repetition occurs during this cycle,
the student is warned.

Rule 22: If the number of repetitions of a command is three or
more thus far in the search and at least one repetition occurs
during this cycle, the student is warned and forced to call the
HELP facility.

Rule 23: If the number of repetitions of a command is six or
more thus far in the search, the student is warned and forced
to call the HELP facility.

Rule 24: If, for at least four COMBINE commands using the AND
or OR operators, the similarity index is less than the assigned
threshold, the student is warned (thrashing).

Rule 25: If, for at least four COMBINE commands using the AND
or OR operators, the similarity index is greater than the
assigned threshold, the student is warned (dwelling).

Rule 26: If the average relevance (i.e., value of total relevance
judgments/number of judgments) of documents viewed is less than
the assigned threshold at this command, the student is warned.

Rule 27: If the average relevance of documents viewed this
cycle is less than the assigned threshold, the student is warned
and forced to call the HELP facility.

Rule 28: If the average relevance of a previous cycle is higher
than the average relevance of this cycle, the student is warned.

Rule 29: If the average relevance at this command is higher
than the assigned threshold, the student is warned (the search
may be complete).

Rule 30: If the average relevance at this command is less than
the assigned threshold and the display format of this command is
uninformative, the student is warned.

Rule 31: If the set size dispersion is converging towards the
student's stated goal, the student is warned.






Rule 32: If the set size dispersion is diverging from the
student's stated goal, the student is warned.

Rule 33: If the set size dispersion is static relative to the
student's stated goal, the student is warned.

Rule 34: If the time between student commands is greater than
the assigned threshold, the student is warned.

Rule 35: If the total citations listed thus far in the search
is greater than the student's stated goal but less than 1.5 times
the student's stated goal, the student is warned.

Rule 36: If the total citations listed thus far is greater than
or equal to 1.5 times the student's stated goal, the student is
warned.

Rule 37: If the total citations listed thus far is greater than
or equal to twice the student's stated goal, the student is
warned and logged off.
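
As an illustration of how the consecutive-command rules (Rules 3 - 8) might be evaluated after each command, the following Python sketch checks the run of commands ending at the latest one. It is not the CANALYSIS code. The type codes, the helper names, and the reading of "string length" in Rule 8 as a run of commands of the same type are assumptions, and thresholds are treated as "at least" values, as in the PL-1 sample of Table 2.

```python
# Assumed type codes (hypothetical); EXPAND and SELECT are both Type 2.
COMMAND_TYPE = {"BEGIN": 1, "EXPAND": 2, "SELECT": 2, "COMBINE": 3, "TYPE": 4}

# command -> (rule number, consecutive-use threshold), from Rules 3, 4, 6 and 7
SAME_COMMAND_LIMITS = {"EXPAND": (3, 8), "SELECT": (4, 8),
                       "COMBINE": (6, 5), "TYPE": (7, 5)}

def trailing_run(items):
    """Length of the run of equal items at the end of the list."""
    n = 0
    for item in reversed(items):
        if item != items[-1]:
            break
        n += 1
    return n

def triggered_rules(commands):
    """Return [(rule number, force_help)] for Rules 3 - 8 after the latest command."""
    if not commands:
        return []
    types = [COMMAND_TYPE[c] for c in commands]
    same_command = trailing_run(commands)
    same_type = trailing_run(types)
    fired = []
    if commands[-1] in SAME_COMMAND_LIMITS:
        rule, limit = SAME_COMMAND_LIMITS[commands[-1]]
        if same_command >= limit:
            fired.append((rule, False))
    if types[-1] == 2 and same_type >= 8:
        fired.append((5, False))        # Rule 5: eight consecutive Type 2 commands
    if same_type >= 10:
        fired.append((8, True))         # Rule 8: warn and force a call to HELP
    return fired

print(triggered_rules(["BEGIN"] + ["SELECT"] * 8))
```

A mixed run of EXPAND and SELECT commands triggers only Rule 5, which is why the rules keep the per-command and per-type counts separate.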

These rules which compose the THRESHOLD ANALYZER may be clustered by
function into the following categories:

a. control for illegal commands (Rules 1 & 2)

b. control for consecutive commands of the same type (Rules 3 - 8)

c. control for the creation of null sets (Rules 9 - 18)

d. control for non-used sets (Rules 19 & 20)

e. control for repetition of the same command (Rules 21 - 23)

f. control for similarity of commands -- thrashing and dwelling
   (Rules 24 & 25)

g. control for the relevance of documents (Rules 26 - 30)

h. control for the dispersion of set sizes relative to the stated
   goal for a final set size (Rules 31 - 33)

i. control for time delay between commands (Rule 34)

j. control for total citations typed (Rules 35 - 37)

When the application of these rules indicates a procedural error, this
program references the Error Message Table (see Table 4) and turns on a bit
to indicate that a message should be sent. The number of the rule in
question serves as the index to the positions in the Error Message Table.
The THRESHOLD ANALYZER also turns on bits in the table, when necessary, to
signal: a) automatic log off; b) cancel current command; or c) force the
student to call the HELP facility.

The bit configurations thus established by the Threshold Analyzer serve
as input to the other main program component of Exercise 2, the Warning
Control Program.



    dcl 01 err_mess_table(50) ext,          /* error message table */
          03 temporary_group,               /* good for current command only */
            04 te_prefix   fixed bin(4),    /* code for connecting message prefix */
            04 te_suffix   fixed bin(4),    /* code for connecting message suffix */
            04 te_send     bit(1),          /* turn on if message to go out */
            04 te_help     bit(1),          /* force user to call help facility */
            04 te_logoff   bit(1),          /* force logoff on user */
            04 te_cancel   bit(1),          /* cancel current command */
          03 permanent_group,               /* good for entire search */
            04 pe_last     fixed bin,       /* no. of last command in which this msg. issued */
            04 pe_knt      fixed bin,       /* counts times the message issued */
            04 pe_specific fixed bin,       /* number of more specific msg., if any */
            04 pe_defer    bit(1);          /* defer message */

Table 4. Error Message Table

2.3.3 Warning Control Program 

The Warning Control Program (WCP) is a PL/1 program which is called by IIDA
as a special action to establish priorities for the messages signaled by
the Threshold Analyzer. The WCP communicates with the Threshold Analyzer
by referencing a common data structure, the Error Message Table (see Table
4). This table has both a temporary and a permanent part so that some
information may be compiled on a per-command basis and other data may
accumulate throughout the search. The temporary section of this table
is re-initialized by IIDA after each call to the Threshold Analyzer and
the Warning Control Program.

For each threshold value which is surpassed during a search, the
Threshold Analyzer turns on a bit in the Error Message Table (EMT) whose
numbered position within the table corresponds to the number of the rule
which was broken. For example, if rule 15 had been broken, the Threshold
Analyzer would turn on the te_send bit in the 15th position in the table.
Since the rule number corresponds to the number of the message to be sent,
turning on this bit signals the WCP that message number 15 is a candidate
for transmission to the student.
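The indexing scheme can be illustrated with a small sketch. It is written in Python for readability rather than in the project's PL/1. The field names follow Table 4, while the function names (`new_table`, `signal_rule`, `reset_temporary`) and the keyword arguments are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List

TABLE_SIZE = 50  # Table 4 declares 50 positions


@dataclass
class EmtEntry:
    # temporary group: good for the current command only
    te_prefix: int = 0
    te_suffix: int = 0
    te_send: bool = False
    te_help: bool = False
    te_logoff: bool = False
    te_cancel: bool = False
    # permanent group: good for the entire search
    pe_last: int = 0
    pe_knt: int = 0
    pe_specific: int = 0
    pe_defer: bool = False


def new_table() -> List[EmtEntry]:
    """Build an empty Error Message Table."""
    return [EmtEntry() for _ in range(TABLE_SIZE)]


def signal_rule(table: List[EmtEntry], rule_number: int, *,
                force_help: bool = False, logoff: bool = False,
                cancel: bool = False) -> None:
    """Mark the message for a broken rule as a candidate for sending."""
    entry = table[rule_number - 1]  # rule n occupies the n-th position
    entry.te_send = True
    entry.te_help = entry.te_help or force_help
    entry.te_logoff = entry.te_logoff or logoff
    entry.te_cancel = entry.te_cancel or cancel


def reset_temporary(table: List[EmtEntry]) -> None:
    """Clear the per-command fields, leaving the permanent group untouched."""
    for entry in table:
        entry.te_prefix = entry.te_suffix = 0
        entry.te_send = entry.te_help = False
        entry.te_logoff = entry.te_cancel = False
```

Calling `signal_rule(table, 15)` turns on `te_send` in the 15th position only; `reset_temporary` models the re-initialization that IIDA performs after each call.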

The WCP scans the Error Message Table and, for each candidate message
signaled there, executes the following rules:

Rule 1: If a given message has a more specific message associated
with the same Threshold Analyzer rule, and the more specific message
is signaled for output, then delete the send status of the generic
message.

Rule 2: If a message has been given recently (i.e., within the
last five commands), then assign the defer status to it.

Rule 3: If, on input, the status of the message is defer, then
assign a code for the proper connecting message and assign the
send status to the message.

Rule 4: If a message has been issued often (more than five times),
then assign a code for the proper connecting message and assign
the send status to the message.

Rule 5: If more than one message is scheduled for output, assign
the code for the proper connecting message.

At the conclusion of this scanning, the WCP will have posted the final
configuration of status codes to the Error Message Table. IIDA will then
read this table and execute the actions indicated.
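One possible reading of the five WCP rules is sketched below in Python (the project itself is written in PL/1). The report does not say how the rules interact, so several points here are assumptions: a message deferred on an earlier pass is released before the "recently given" test is applied, the connecting-message codes are hypothetical constants, and updating `pe_last` and `pe_knt` is left to the caller.

```python
from dataclasses import dataclass
from typing import Dict, List

RECENT_WINDOW = 5      # "within the last five commands"
OFTEN_LIMIT = 5        # "more than five times"
CONNECT_DEFERRED = 1   # hypothetical connecting-message codes
CONNECT_REPEATED = 2
CONNECT_MULTIPLE = 3


@dataclass
class Message:
    te_send: bool = False
    te_prefix: int = 0
    te_suffix: int = 0
    pe_last: int = 0       # number of the last command in which it was issued
    pe_knt: int = 0        # times issued so far
    pe_specific: int = 0   # number of a more specific message, 0 if none
    pe_defer: bool = False


def run_wcp(table: Dict[int, Message], current_command: int) -> List[int]:
    """Apply WCP Rules 1-5 and return the message numbers scheduled for output."""
    signaled = {n for n, m in table.items() if m.te_send}  # status on input
    for m in table.values():
        deferred_on_input = m.pe_defer
        if not (m.te_send or deferred_on_input):
            continue
        if m.pe_specific in signaled:            # Rule 1: specific message wins
            m.te_send = False
            continue
        if deferred_on_input:                    # Rule 3: release a deferred message
            m.pe_defer = False
            m.te_send = True
            m.te_prefix = CONNECT_DEFERRED
        elif m.pe_knt > 0 and current_command - m.pe_last <= RECENT_WINDOW:
            m.te_send = False                    # Rule 2: given recently, so defer
            m.pe_defer = True
            continue
        if m.pe_knt > OFTEN_LIMIT:               # Rule 4: issued often
            m.te_prefix = CONNECT_REPEATED
            m.te_send = True
    scheduled = sorted(n for n, m in table.items() if m.te_send)
    if len(scheduled) > 1:                       # Rule 5: several messages go out
        for n in scheduled:
            table[n].te_suffix = CONNECT_MULTIPLE
    return scheduled
```

The table is keyed by message number here instead of being a fixed array, which keeps the example short; the status fields keep the names used in Table 4.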

2.3.4 Expand Look-up Subroutine 



At this time, this subroutine has not been completely designed. However, 
it will operate within Exercise 2. 






III SEARCH PROCESS ASSESSMENT 

1. Error Analysis

The error classification appended to this report (Appendix C) was
empirically developed from analysis of over 50 transcripts of on-line
searches. The development was a two-stage process. The classification
categories were selected and defined in the first step. During the
second step they were modified and refined using 45 "real" searches using
a variety of data bases on three search systems.

Still, the classification should be considered a draft. It is not
a trivial problem to develop a straightforward, consistent and precise
classification of this type of error, and more work is needed before the
proposed classification can be considered satisfactory. Although it will
probably never be perfect, this analysis can shed light on the type and
magnitude of errors made by searchers, and thus provide information which
could be used to decrease errors, either through user education or changes
in system design.

Table 5 shows the results of the analysis of the 48 "real" searches.
Fourteen percent of the commands contained errors which were transmitted
to the system (excluded were error type I.A., errors corrected before
transmission). The average number of errors per search was 3, with a
range of from zero to 13.

Another count of errors was made from 40 transcripts generated during
an experiment in which searchers performed a group of pre-selected ERIC
searches on the Lockheed system (see Table 6). The searchers were divided
into two groups, experienced and novice searchers. In this analysis all
typographical and spelling errors (I.A. and I.B.) were excluded, so the
results are not strictly comparable to those shown in Table 5. These data
show that novices make twice as many substantive errors per command as
experienced searchers.
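The summary figures can be checked against the totals reported in Tables 5 and 6. The short Python sketch below is only an arithmetic check; the function and variable names are assumptions. It pools errors over all commands, which may differ slightly from the tabled averages if those were computed search by search.

```python
def rate(total_errors: int, total_units: int) -> float:
    """Pooled rate: total errors divided by total commands (or searches)."""
    return total_errors / total_units


# Table 5: 48 "real" searches, 147 transmitted errors, 1034 commands
real_errors_per_command = rate(147, 1034)
real_errors_per_search = rate(147, 48)
real_commands_per_search = rate(1034, 48)

# Table 6: 24 novices (62 errors, 429 commands), 16 experienced (17 errors, 268 commands)
novice_rate = rate(62, 429)
experienced_rate = rate(17, 268)
novice_to_experienced = novice_rate / experienced_rate

print(f"real searches: {real_errors_per_command:.2f} errors per command")
print(f"novice vs. experienced ratio: {novice_to_experienced:.1f}")
```

The pooled ratio comes out somewhat above two, in line with the statement that novices make about twice as many substantive errors per command.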

The proposed classification is neither as complete nor as detailed as
it could be. Since it was derived empirically, it contains specific
categories for errors which have been observed to occur with some frequency
rather than specific categories for all possible errors. It can, however,
serve to point the way to problem areas and to provide order-of-magnitude
data. A more detailed classification which defines errors so specifically
that a computer can recognize them, and which relates the errors to specific
commands, could be developed from this classification.

A number of apparent causes of errors have been identified. In an 
order approximately paralleling the classification, they are: 

1. Failure to type perfectly. 

2. Failure to spell perfectly. 

3. Failure to have mastered the command language.






Total searches:                                           48

Total errors not corrected before transmission:          147
(total errors minus I.A.)

Average errors not corrected before transmission:          3
   Range: 0 to 13

Total number of commands:                               1034

Average number of commands:                               22
   Range: 3 to 72

Average errors/command:                                  .14
   Range: .04 to .63

Table 5. Error Analysis of "Real" Searches



                                                 Novices    Exp.
                                                            Searchers

Total searches:                                    24        16

Total non-typographical or spelling errors:        62        17

Average non-typo or spelling errors:               2.6       1.1
   Range: 0 to 13

Total number of commands:                          429       268

Average number of commands:                        17.8      16.7
   Range: 2 to 33

Average errors/command:                            .15       .07
   Range: 0 to .39

Table 6. Error Analysis of ERIC/Lockheed Experimental Searches






4. Failure to pay attention to results.

5. Failure to remember preceding commands.

6. Failure to understand the search logic.

7. Failure to understand the file structure.

8. Failure to use the controlled vocabulary correctly.

Some of these errors could be detected fairly easily by a computer
monitor (e.g., syntactic errors). However, other analyses, such as
distinguishing a misspelled term from a controlled-vocabulary term input
incorrectly, may be done automatically only with great effort. Still other
analyses depend upon the observer following, and making educated guesses
about, the thought processes of the searcher; for example, deciding when
the wrong logical operator was used.

2. Identification of Measures Which Discriminate Between Users

2.1 Introduction

There is a need for accepted and widely applicable measures of searching
performance. Presently, such measures do not exist. The goal of this
research is to examine the feasibility of using the behavior of searchers in
their communication with the machine as measures of performance. That is,
it is the process of searching which is the focus of attention.

The attraction of using the process of searching, rather than the results
of searches, to assess performance is that an important segment of the
search process can be monitored automatically and unobtrusively by the
computer. This is not true for search results. Nor is it true for manual
reference searches, for which the process is much more difficult to study.

In order to determine the behaviors which correlate with performance,
i.e., skill in searching, it is necessary to compare searches which vary in
success. One approach might be to look at the results of searches, and
compare process to result; this is a part of the proposed research.
However, since the measures of results of searches which are available
(recall and precision) are only very rough approximations, this may not be
the most productive approach.

A better way of selecting searches which vary in quality would be to 
first select searchers who vary in skill. Given that there is no objective 
way to select searchers by skill level, experience level will be used
instead. The underlying assumption is, of course, that experience is
strongly correlated with skill.

Thus the major research objective is the identification of the
differences between like searches of users of online systems who have different
amounts and types of experience. Searchers classified into several experience
categories will be asked to search four search problems. Data will be
collected on the background of the searchers and on over 20 process and
outcome variables. It will then be possible to perform a variety of analyses
which will contribute knowledge about the relationship between the search
process and the factors which are believed to influence it, the characteristics
of the searcher. It should also be possible to relate the search process to the
factors which it influences, the search results.

2.2 Objectives and Rationale 

The primary objective of the proposed study is to identify those
techniques which differentiate between the searches of persons with different
overall amounts of experience.

This study also has several subobjectives. The first is to identify
those techniques which differentiate between the searches of persons who
are searching a database with which they are familiar and the searches
of persons who are searching an unfamiliar database. The second is to
identify the factors which contribute to success in searching. The third
is to present descriptive information on errors made in searching. The
fourth is to describe the utilization of various capabilities of the system.

Of the possible methodologies that could be used to accomplish the
objectives, a quasi-experimental design has been selected because, in terms
of economy and feasibility, it appears to be by far the best approach to the
problem. In a quasi-experimental design one tries to simulate "pure"
experimental design in a situation where one does not have the capacity to
assign subjects randomly to treatment groups.

2.3 Methodology 

Seventy-two searchers will each perform two of four pre-selected
searches on the Lockheed/DIALOG system using ONTAP, the 1975 equivalent of
the ERIC database. The searchers will be selected from five groups: novice
searchers, moderately experienced searchers with ERIC experience, moderately
experienced searchers with no ERIC experience, very experienced searchers
with ERIC experience, and very experienced searchers with no ERIC experience.

Data will be collected on the background of the searchers. In addition,
over 20 process and outcome variables will be measured by examination of the
search transcripts. Statistical techniques will be used to identify both
the process variables which are the best discriminators between experience
groups and the process and background variables that best predict the
dependent outcome variables.

2.3.1 Subjects 

The seventy-two subjects will be selected from searchers in general
to conform to the characteristics of the five groups shown in Figure 6.
The novice searchers will be randomly selected from the daytime Fall
1978 Fundamentals of Library and Information Science (FUNLIS) class at
Drexel University. The experienced searchers will be recruited from the
community of working online searchers.



                          DIALOG/No ERIC     DIALOG/ERIC

Very Experienced               12                12
                               (1)               (2)

Moderately Experienced         12                12
                               (3)               (4)

Novices                        24

Figure 6. Study Design



2.3.2 Variables 

The variables relating to online searching can be divided into four
types: a) environmental variables; b) searcher variables; c) process
variables; and d) outcome variables. These variables are listed in Tables
7 through 10. The level of measurement is shown for the variables on which
data will be collected.

The experimental procedure is designed to control for most of the
environmental variables. The subjects will be given two of the same four
searches to perform on the same database and the same search system. The
requestor (the researcher) is the same for all the searches.

One can see from Table 8, the list of searcher variables judged to be
important, that a large number and variety of types of training and experience
might affect online systems performance.

Data will be collected on all the process and outcome variables
listed in Tables 9 and 10 except the need for help.

2.3.3 The Search Problems

The file to be used is ONTAP (File 201) on the Lockheed/DIALOG system.
This is a static file which contains the 1975 ERIC (Educational Resources
Information Center) file. It corresponds in all respects (data elements,
searchable fields, etc.) to the regular DIALOG ERIC file for 1975 accessions.
ONTAP contains about 35,000 references, approximately 12% of the ERIC file.
The ONTAP file contains "answer sets" for 29 searches which were created
by exhaustively searching the file. These answer sets have been equated to
the results of a perfect search (100% recall and 100% relevance) for each
of the 29 search topics.
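Treating an answer set as the result of a perfect search makes it possible to score any retrieved set against it. A minimal Python sketch using the standard definitions of recall and precision follows; the function name and the accession numbers in the usage note are made up for illustration, and precision is reported as 0.0 for an empty retrieved set by assumption.

```python
from typing import Iterable, Tuple


def recall_precision(retrieved: Iterable[str], answer_set: Iterable[str]) -> Tuple[float, float]:
    """Score a retrieved set against an answer set taken as the perfect search.

    recall    = relevant items retrieved / items in the answer set
    precision = relevant items retrieved / items retrieved
    """
    retrieved_set = set(retrieved)
    answer = set(answer_set)
    if not answer:
        raise ValueError("answer set must not be empty")
    hits = retrieved_set & answer
    recall = len(hits) / len(answer)
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    return recall, precision
```

For example, a search that retrieves four records, three of which belong to a five-record answer set, scores `recall_precision({"A1", "A2", "A3", "Z9"}, {"A1", "A2", "A3", "A4", "A5"})`, that is, three of five for recall and three of four for precision.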

The prepared ONTAP searches are categorized according to complexity:
simple, medium and difficult. Four searches of medium difficulty which fit
the following criteria were chosen for the experiment:

1. The topic is not technical.

2. The search is suitable for a wide variety of strategies.

3. It is simple enough for novices to handle, and difficult
   enough to offer some challenge for the very experienced searchers.

4. There are more than 3 documents in the answer set.

2.3.4 Experimental Procedure

Each searcher will conduct two searches. The novice searchers will
perform the searches in the Drexel Information Science Laboratory by
appointment. Since the experienced searchers will be scattered
geographically, they will be recruited by telephone, and the data will be
collected by mail. The experienced searchers will be given carefully



Variable

(Columns: Data to be Collected On; To be Controlled For; Not To be Measured)

1. Database
   a. Specific
   b. Cost
   c. Subject

2. Search System

3. The Search
   a. Characteristics of the requestor
   b. Objective of the search
   c. Complexity of the search
   d. Subject of the search
   e. The specific request

4. Organization
   a. Type
   b. Management attitudes
   c. Charging policy

5. Physical space

6. Terminal

7. System response time

8. Machine-related problems (other)

9. Access to search tools

10. Presence of the researcher

(Column entries as extracted, row positions not recoverable: N; X, eight
entries; Partially, five entries)

O = Ordinal level variable

Table 7. Environmental Variables



Variable

(Columns: Data to be Collected On; To be Controlled For; Not To be Measured)

I. Education
   A. Undergraduate
      1. Year of degree
      2. Major field
      3. Minor field
   B. Graduate
      1. Year of degree
      2. Major field(s)
   C. Other
      1. Training in subject of database
      2. Training in mathematics or science
      3. Training in library science

II. Online Bibliographic Search Training
   A. Years since training
   B. Type of initial training
   C. Continuing education

III. Online Bibliographic Search Experience
   A. Total experience
      1. Number of searches ever performed
      2. Number of searches ever performed using a specific
         vendor system
      3. Number of searches ever performed on a specific database
      4. Number of searches ever performed on a specific database
         using a specific vendor system

(Column entries as extracted, row positions not recoverable: N; N; N, N, N;
I; N; N; X, X; Partially, four entries)

I = Interval level data
O = Ordinal level data
N = Nominal level data

Table 8. Searcher Variables



Variable

(Columns: Data to be Collected On; To be Controlled For; Not To be Measured)

   B. Current activity level
      1. Number of searches performed per month
      2. Number of searches performed per month on a specific vendor
         system
      3. Number of searches performed per month on a specific database
      4. Number of searches performed per month on a specific database
         using a specific vendor system

IV. Other Experience
   A. Experience with reference searching
   B. Experience with hard copy equivalent of database
   C. Experience with computers
   D. Typing ability

V. Personal Characteristics
   A. Intelligence
   B. Creativity
   C. Problem-solving ability
   D. Cognitive style
   E. Flexibility
   F. Age
   G. Sex
   H. Attitude to online searching
   I. etc.

(Column entries as extracted, row positions not recoverable: N; O, O, O;
Partially, four entries; X, seven entries)

I = Interval level data
O = Ordinal level data
N = Nominal level data

Table 8. Searcher Variables (Continued)



Variable                                        Data to be Collected On

 *1. Commands used (by type of command)                   I

 *2. Descriptors searched                                 I

 *3. Different types of descriptors used
        Thesaurus                                         N
        Free text                                         N
        Prefixes                                          N
        Suffixes                                          N

 *4. Errors by:
        Typographic                                       I
        Other (classified)                                I

  5. Errors with potential impact on search results      I

  6. Errors with actual impact on search results         I

 *7. Use of sophisticated techniques
        Short logic form                                  N
        Stacking                                          N
        Truncation                                        N
        Adjacency                                         N
        Nested logic                                      N
        Printing in useful subsets                        N

 *8. Number of records viewed                             I

 *9. Number of sets viewed                                I

*10. String/cycle analysis                                I

 11. Search rating by knowledgeable searchers             I

*12. Requests for help

(Remaining columns: To be Controlled For; Not To be Measured)

I = Interval level data
N = Nominal level data

*Computer monitorable
**Partly monitorable

Table 9. Process Variables



Variable                                        Data to be Collected On

1. References retrieved                                   I

2. Recall                                                 I

3. Precision                                              I

4. Connect time                                           I

5. Efficiency                                             I

6. Searcher satisfaction                                  O
      Likert scale
      Semantic differential

   User satisfaction

(Remaining column: To be Controlled For)

I = Interval level data
O = Ordinal level data

*Computer monitorable

Table 10. Outcome Variables



worked out directions which have been pilot-tested in advance.

Each of the four searches will be performed six times by the group
of searchers in each "experience cell" in Figure 6, and twelve times by the
novice searchers. The searches will be randomized within each cell.
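The counts follow from the design: 12 searchers doing two searches each give 24 performances per cell, or six per search, and 24 novices give twelve per search. The Python sketch below shows one way such a balanced, randomized assignment could be produced. The report says only that the searches will be randomized within each cell, so the pairing scheme and all names here are assumptions.

```python
import itertools
import random
from collections import Counter
from typing import List, Tuple

SEARCHES = (1, 2, 3, 4)  # the four pre-selected search problems


def assign_searches(n_searchers: int, rng: random.Random) -> List[Tuple[int, int]]:
    """Give each searcher two different searches so every search is used equally.

    Each of the six possible pairs is used equally often, which is why
    n_searchers must be a multiple of six.
    """
    pairs = list(itertools.combinations(SEARCHES, 2))
    if n_searchers % len(pairs) != 0:
        raise ValueError("n_searchers must be a multiple of 6")
    pool = pairs * (n_searchers // len(pairs))
    rng.shuffle(pool)                                      # which searcher gets which pair
    return [tuple(rng.sample(pair, 2)) for pair in pool]   # order within the pair


cell = assign_searches(12, random.Random(1978))       # one experience cell
novices = assign_searches(24, random.Random(1979))    # the novice group
cell_counts = Counter(s for pair in cell for s in pair)
novice_counts = Counter(s for pair in novices for s in pair)
```

With this scheme `cell_counts` shows each search performed six times in a cell of 12, and `novice_counts` shows twelve times among the 24 novices, matching the figures in the text.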

2.4 Results

A discussion of the results will be forthcoming.

2.5 Discussion

The proposed research is multifaceted. Although the main interest is
in the differences between searchers at varied experience levels, the large
body of empirical data collected in the study of the search process could
be used for other purposes, particularly for designing monitors for online
systems.

Information on the ways in which searches are actually being performed
should be useful to system designers and educators as well. For example,
a tabulation of errors made in searching might point out system features
which cause special difficulty. Effort could then be made to correct the
difficulty either by the system designer through changes in the system, or
by educators through special attention during training.

It is expected that a major value of the proposed research is its
potential contribution to the methodology of evaluating both the effects of
searcher background on online system performance, and the systems side of
the interface. This is an area where there is an acute need for work.

The major independent variables in this study are levels and types of
experience. Experience was felt to be most suitable for this first effort
because it is more likely to cause differences in searching behavior than
some of the other variables. If differences in behavior due to experience
are found to be measurable, the methods developed here can be refined and
used to look for effects that may be more subtle. For example, there are
a number of other important and related research problems having to do with
the effect of subject knowledge, or of training, on search behavior and
performance. From the systems side of the interface, there is the problem
of evaluating the effects of particular command languages.






APPENDIX A

On-line Searching Project
Instructions

As part of our research on on-line data base searching, we are
trying to get a clearer picture of the procedures that are involved
in conducting an on-line search. In order to more easily follow the
way in which a searcher attacks a search problem, we are asking that
you think out loud as you work through a sample search request that
we will give you.

Specifically, we would like you to think out loud as soon as you
begin looking over the search request, saying all the thoughts that
come to you as you study the problem and begin to formulate your strategy
for solving it. It will not be necessary to think out loud while actually
doing the search on-line. Then after doing the search, we would like
you to go over the transcript and again talk about what was going through
your mind during the search, especially stating your reasons for
conducting the search exactly as you did, including steps which did not
lead to useful results, and whatever decisions you made while on-line,
including any alternative strategies you considered but then rejected.
We would like to emphasize that this is not a test of how well you search;
we are primarily interested in how a search is generally conducted, and
through what stages a searcher progresses in solving a request.
Therefore, we would like you to be very specific about what you were thinking
at each step of the search. Other than talking about your search
procedure, however, we would like you to treat this problem as if it were
one you received during your normal work situation, so that your search
follows, as much as possible, the same procedure you would generally use.

We should also mention that your participation in this project will
be kept confidential and that you will not be personally identified in
any written or oral communication concerning this project.






APPENDIX B
SEARCH REQUESTS

Request #1

The user would like to find references containing gas chromatographic
and/or mass spectrometric analyses of nitrosamine compounds (especially
dimethyl- and diethylnitrosamine, but any others that can be found, as
well).

Search the CHEMCON data base (File 3) on Lockheed and print out, on-line,
the CA abstract numbers of all references retrieved.

Request #2

The user has found a paper of interest and would like to find
references to all other work related to this paper. The paper is by David
S. Auld in Biochem. Biophys. Res. Comm. (1976) and is entitled, "Yeast RNA
polymerase I, A eukaryotic zinc metalloenzyme."

Search the CHEMCON data base (File 5) on Lockheed and print out, on-line,
the CA abstract numbers of all references retrieved.

Request #3

The user would like some background material on catecholamines. He will
be starting work with this group of compounds but knows little about them, so
he would like to have just a few major references that can give him a quick
overview of the state-of-the-art of this field.

Search the CHEMCON data base (File 3) on Lockheed and print out, on-line,
the CA abstract numbers of all references retrieved.



APPENDIX C

ONLINE SEARCHING ERROR CLASSIFICATION

DEFINITIONS

I. Typographical and spelling errors

   A. Corrected before transmission

      Any error corrected before transmission, either in the
      search term or in the command language.

   B. Not corrected before transmission

      An error in a command or descriptor that is not obviously
      a format or terminology error. When in doubt use VI.

II. Syntactic/semantic errors

   A. Omitting commands

      Forgetting to input a command code; i.e., selecting
      and combining terms as one would in the ORBIT system.

   B. Combining descriptors rather than sets

      In the COMBINE command, using full words instead of
      set numbers.

   C. Wrong command code

      Code is valid, but is used in the wrong place.

   D. Format errors

      Incorrectly formatted commands, or commands in which
      the codes or punctuation conventions are incorrect.

   E. Other

III. Procedural errors

   A. Command unnecessary or repeated unnecessarily

      Repeating the same command, or inputting a command
      that gives redundant information.

IV. Logic errors

   A. Forming a set bound to produce zero postings

      In a logical operation, combining terms in a manner so
      that the result is necessarily zero postings.



   B. Wrong logical operator used

      One logical operator (AND, OR, or NOT) substituted for
      another.

   C. Failed to use already-combined sets

      Re-creating, in a COMBINE statement, sets already
      created.

   D. Wrong set number used

      Creation of an unintended set, resulting from use of
      wrong set number(s); or, use of a non-existent set
      number.

   E. Performed unnecessary logical operation

      Logical operation is redundant; result should already
      be known from previous logical operations. (Prefer
      to III.A.)

   F. Other

V. Terminology errors

   A. Used incorrect subject term; correct term in thesaurus

      Correct term could have been found using a cross-reference
      in the thesaurus.

   B. Used incorrect subject form; correct form available in
      thesaurus

      Used for cases when the spelling, punctuation, or ending
      is slightly different from the term in the thesaurus.

   C. Used as a descriptor a term not in thesaurus

      Most frequently, this would be an invalid multi-word
      term which would receive zero postings.

   D. Incorrect subject term format

      Refers to mistakes in the adjacency features or labelling
      protocols for subject term descriptors. (Prefer
      to II.D.)

   E. Non-subject term input incorrectly

      Non-subject term input in the wrong format.



   F. Term unnecessary; would be covered by another term

      For example, searching for both a subject heading and
      a single-word descriptor that is part of the subject
      heading. (Prefer to III.A.)

   G. Other

VI. Impossible to classify/other

      Used when in doubt or for all inexplicable entries.



INDIVIDUALIZED INSTRUCTION FOR DATA ACCESS 

(IIDA) 



Quarterly Report No. 3
December, 1978 

Drexel University, School of Library 
and Information Science 
Franklin Institute Research Laboratories 

NSF Grant No. DSI 77-26524 






TABLE OF CONTENTS

I. OVERVIEW

II. EVALUATION

1. Formative Evaluation

1.1 Project Staff

1.2 Computer Science Majors

1.3 "Real" Searches

2. IIDA as Assistant

2.1 Fundamentals of Library and Information Science
    User Group

2.2 Journeyman Searchers

3. IIDA as Instructor

3.1 Technical Writing

3.2 Business Writing

3.3 Cognitive Style

4. Measures for Evaluation of IIDA

4.1 Dependent Variables

4.2 Independent Variables

4.2.1 User Satisfaction

4.2.2 Attitudes Toward Future User Behavior

4.2.3 Problem-Solving

4.2.4 Demographics

III. REFERENCES



I. OVERVIEW 

This project is a renewal of earlier work on Individualized
Instruction for Data Access (IIDA). Begun in July 1976, with initial
funding for one year, the project was resumed in April 1978 and is to
be completed in two years. This series of quarterly progress reports
is planned to report in depth on selected aspects of the project and to
contain a brief overall progress statement in each report.

The project staff are divided into two groups. The computer group
is concerned with the design, implementation and testing of the requisite
computer programs. From the user's standpoint there are four major
sub-sections of the total system. In the first exercise the programs lead a
user through a basic search in lock-step fashion, introducing some basic
commands and providing familiarization with the general structure of a
search.

The second exercise allows the user to do a constrained search.
Although he is not free to use any search command at any time, he is free
to carry out the search pretty much as he wants. In this exercise a core
set of diagnostic routines and rules are used by the program to monitor
the activity of the user and provide various kinds of feedback or
assistance.

The third exercise represents advanced search training in that, as
in exercise one, the user is introduced to search commands and their use.
The additional commands introduced here will include such things as
variations on SELECT and the shorthand notations for DIALOG commands.

In the assistance mode (or "fourth exercise") the user is allowed to
do an unconstrained search of his choosing. For this exercise the set of
diagnostics and rules is to be expanded in order to deal with the
considerably greater freedom which the user has relative to exercise two.



By the end of February 1979 the first and second exercises will have
undergone both system debugging and preliminary formative evaluation by a
small group of computer-literate users looking actively for flaws in the
system. With the second exercise providing the nucleus to be expanded into
the assistance mode, both the third exercise and the assistance mode should
be ready for use in evaluation testing in the spring of 1979.

The behavioral group of the project staff is concerned with both
formative and summative evaluation of IIDA. In formative evaluation our
concern is with monitoring system development and with providing feedback
and information for refinement and further development of the system. For
example, a number of the rules incorporated into the second exercise require
the specification of a threshold value which, when exceeded by the user, results
in the sending of a message to the user. These values are at present
set by intuition or by arbitrary choice. Presumably use of the system will
lead to revision of the threshold values.
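
The threshold rules just described can be pictured with a minimal sketch. The rule name, the counter it watches, the threshold of 3, and the message text are all hypothetical illustrations, not values taken from the IIDA programs, and Python is used only for convenience.

```python
from dataclasses import dataclass


@dataclass
class ThresholdRule:
    """One diagnostic rule: watch a counter, send a message once it exceeds a threshold."""
    name: str        # hypothetical rule name, also the key of the counter it watches
    threshold: int   # set by intuition at first, to be revised after use
    message: str     # text sent to the user

    def check(self, count):
        """Return the message if the user's count exceeds the threshold, else None."""
        return self.message if count > self.threshold else None


def run_diagnostics(rules, counters):
    """Apply every rule to the session counters and collect the messages to send."""
    messages = []
    for rule in rules:
        msg = rule.check(counters.get(rule.name, 0))
        if msg is not None:
            messages.append(msg)
    return messages


# Hypothetical rule: too many searches in a row that retrieve nothing.
rules = [ThresholdRule("zero_hit_selects", 3,
                       "Several searches found nothing; try broader terms.")]
print(run_diagnostics(rules, {"zero_hit_selects": 4}))
```

Keeping the threshold as data rather than burying it in the rule logic is one way to make the later revision of threshold values a matter of changing a number.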

In summative evaluation the concern of the behavioral group is with an
assessment of the impact and effectiveness of the IIDA system and with the
extent to which the objectives of the project are met. As indicated in the
last quarterly progress report (1) the main topic for this report is the plans
that have been made for summative evaluation.

In the body of the report that follows there is a discussion of four
specific issues. The first of these deals with some aspects of formative
evaluation planning which overlap with summative evaluation. The second and
third have to do with specific plans for evaluation of the impact of the
system on users. Given the structure of the IIDA system it is possible to
ask two major kinds of questions. The first of these is about the effects
of IIDA when the system operates only as an assistant. When dealing with
this issue it is assumed that the user has previously had a reasonable
amount of training in DIALOG searching and engages IIDA only through the
assistance mode. The second major kind of question one can ask has to do
with the effectiveness of the IIDA exercises in teaching new users how to
do bibliographic information retrieval. When dealing with this issue it
is assumed that the user has had no previous direct experience with searching
and utilizes the capacities of IIDA as both instructor and assistant.
The fourth and final major portion of this report is devoted to a discussion
and analysis of the kinds of measures which can and should be used in
assessing the impact of IIDA.



II. EVALUATION 

1. Formative Evaluation

1.1 Project Staff

One type of formative evaluation of IIDA will begin with the availability
of exercises one and two. Project staff members will become users
of the system in order to have the experience of seeing what it is like
and in order to look for flaws or ways to improve the operation of the
system.

1.2 Computer Science Majors

Next a small group of undergraduate computer science majors will be recruited
for the purpose of destructive testing. Because of the lock-step
nature of exercise one, it is expected that the bulk of the destructive
testing will be focused on exercise two.

On the Drexel campus there is an undergraduate organization called the
Math and Computer Science Club. Contact with this organization has been
made and several undergraduates have been recruited. These students are very
enthusiastic about the opportunity to act as users and to push the system
until its flaws show. In fact part of the initial briefing of these users
will be to challenge them to find things wrong.

1.3 "Real" Searches 

In addition, a number of searches done by real searchers will be re-done 
through exercise two in order to look at the responses of the system to 
"real" searches. The seventy-two searches to be done through exercise two 
will be taken from the study on search process assessment described in a
previous report (1).

It is at this stage that the real core of the evaluation work begins.
One component part of the search process assessment study, to be described
in more depth in a subsequent progress report, is an attempt to establish
"quality of search" scores for each of the searches collected. These scores
will be derived from rating scale judgments of the quality of each search
provided by "expert" judges.

This requires setting up criteria for what constitutes a good search,
obtaining experienced searchers to make the judgments, and then having the
experienced searchers rate each search on the criteria specified. If fairly
high inter-rater agreement can be obtained then this would provide an
independent measure of search quality.
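
One way such agreement could be checked is sketched below. The report does not say which agreement statistic is to be used; the mean pairwise Pearson correlation, the judge names, the 1-7 scale, and the ratings themselves are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of ratings.

    Assumes neither list is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mean_pairwise_agreement(ratings_by_judge):
    """Average the correlation over every pair of judges."""
    pairs = list(combinations(ratings_by_judge.values(), 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Hypothetical data: three judges rate the same five searches on a 1-7 scale.
ratings = {
    "judge_a": [6, 3, 5, 2, 7],
    "judge_b": [5, 3, 6, 1, 7],
    "judge_c": [6, 2, 5, 2, 6],
}
agreement = mean_pairwise_agreement(ratings)
print(round(agreement, 2))
```

If agreement of this kind is high, the judges' ratings could reasonably be averaged into a single quality score per search.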

Note that most of the diagnostic information kept by IIDA is, in effect,
a set of measures of the search quality in that, for example, we would expect
the number of times particular messages have been sent to the user
to be related to the quality of the search. If, then, the measure of search
quality allows us to discriminate between the just-trained users and the
more experienced users who participated in the search process assessment
study, the relationship between the search quality measure and the kinds of
variables measured by the record keeping and diagnostic functions of IIDA
can be explored.

Thus when the seventy-two searches are re-done through the IIDA second
exercise we can accomplish two major tasks. The first is to explore the response
of the second exercise to "real" searches while looking for flaws that
need to be corrected. The second major task is the development of a set of
criterion measures which can be used in subsequent summative evaluation work.

2. IIDA as Assistant 

In looking at the issue of how well IIDA is able to perform as an assistant
during the search we are basically concerned with users who have already had a
reasonable degree of search training and consequently are to be exposed to IIDA
only through the assistance mode. Ideally we should work with several kinds
of user groups which differ in the amount of search experience and/or the kind
of search training they have had.

The following studies are described in order of priority in that we do
not seriously expect to be able to conduct all of the studies described in
this report but we do expect to be able to complete several and the current
list represents our priorities ranging from "must do" to "would be nice if."
It should also be noted that the set of studies described below is shaped
by consideration of the available resources for conducting possible studies.
For example, the limited number of simultaneous or near simultaneous users
that can be accommodated precludes doing certain kinds of studies. Given
that users must be tested sequentially rather than simultaneously, any user
group considered for this kind of testing must be one where we can have contact
with various members of the group over a period of several days.

2.1 Fundamentals of Library and Information Science User Group 
The School of Library and Information Science at Drexel University admits 
a number of new graduate students every year. When new students are admitted 
to the School they are required to register for a course entitled "Fundamentals
of Library and Information Science." One of the components of this course
is a block of instruction in computer based bibliographic searching. This
block of instruction, totaling roughly 12-13 hours, includes both classroom
lectures and hands-on laboratory experience in searching. The Fundamentals
students are recommended as a group for study not only by the fact that they
are conveniently located for easy access, but also by the fact that they are
similar to the intended IIDA user in that they do not have a great deal of
search experience and can consequently be expected to run into difficulties. 
Hopefully IIDA will be responsive to these difficulties. 

The study will be accomplished simply by adding on to the presently required
search activities a further requirement. This requirement will consist
of a standardized search request which will be the same for all students.
Since some of these students will ultimately go on to become intermediaries
the statement of the problem will be much like those they can expect to receive
in the future except that they will be unable to interact with the
person submitting the request. Randomly, one-half of the group will be assigned
to conduct the search with IIDA assistance while the other half will
conduct the search without IIDA assistance.

2.2 Journeyman Searchers

Part of the search process assessment study referred to above and in
an earlier report (1) involved recruiting active experienced searchers for
study. These searchers were recruited from the Delaware Valley region and
are employed either as information retrieval specialists in private industry
or in academic libraries. This group of searchers is the population which
we intend to turn to for a study of the effects of IIDA assistance using
the procedures already developed in the search process study and assuming
that the earlier study has not exhausted the pool of people willing to
participate.

The basic study design will involve two standardized search requests
which are to be sent to each searcher who agrees to participate. Randomly,
half of the searchers will be asked to do one of the searches first with
the other half being asked to do the other first. Within each of these orders
half of the searchers will be asked, randomly, to do the first search
with IIDA assistance while the other half will be asked to do the first
search without IIDA assistance.
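
The assignment scheme above amounts to a two-by-two counterbalanced design, and a minimal sketch of it follows. The searcher labels, the request names, and the fixed seed are hypothetical; the sketch also assumes that each searcher's second search uses the other request under the other condition, which the design implies but does not spell out.

```python
import random


def assign_conditions(searchers, seed=None):
    """Randomly spread searchers over four cells, as evenly as possible:
    (which request comes first) x (whether the first search uses IIDA)."""
    rng = random.Random(seed)
    shuffled = list(searchers)
    rng.shuffle(shuffled)
    cells = [
        ("request_1", True), ("request_1", False),
        ("request_2", True), ("request_2", False),
    ]
    plan = {}
    for i, person in enumerate(shuffled):
        first_request, iida_first = cells[i % 4]
        plan[person] = {
            "first_request": first_request,
            "iida_on_first_search": iida_first,
        }
    return plan


# Hypothetical pool of eight searchers.
plan = assign_conditions([f"searcher_{n}" for n in range(1, 9)], seed=1979)
for person, condition in sorted(plan.items()):
    print(person, condition)
```

Shuffling first and then dealing people into cells keeps the cell sizes equal, which simple coin-flipping per searcher would not guarantee.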

While it might be simpler to conduct this study by asking each searcher
to do only one search, either IIDA or non-IIDA, a differential return rate
on the part of one group or the other would make the results of the study
very difficult to interpret. With the design planned here there should not
be a differential return rate and if there were it would seem to be very
unlikely that the difference would be a function of anything directly
related to differences between IIDA and non-IIDA assisted searching.

3. IIDA as Instructor

In looking at the issue of how well IIDA is able to perform the service
of enabling novice users to do a successful search we are basically
concerned with users who have had no previous direct or instructional experience
with computerized information retrieval. As mentioned in section
2, the studies described below are constrained by available resources and are
described in order of priority. It should also be noted that we intend to
do one study from section 2 and one from section 3 before doing any additional
studies.

3.1 Technical Writing 

Two years ago the Engineering College at Drexel University instituted
a course requirement in Technical Writing for all engineering students. Each
term there are several sections of this course offered. We have proposed to
the faculty teaching this course that it would be a relevant experience for the
students to learn something about bibliographic information retrieval. Many of
the Drexel engineering graduates will ultimately be employed by organizations
which utilize the services of information retrieval specialists. Presumably
students who have had some direct exposure to searching should be better able
to work with the people doing the searching.

The faculty involved with the Technical Writing courses have been very
enthusiastic about the idea of incorporating IIDA instruction into their course
and some have even offered us up to a week of in-class time should we feel we
need it.

For both experimental and pedagogical reasons each student will perform a
search on a self-selected topic in each of two ways. One way in which the
search will be done is through the mechanism of learning to do and actually
doing a search with IIDA. The second way in which the search will be done
is through the normal process of working with the intermediary in the library
who will actually conduct the search. From the standpoint of instructional
objectives each student will get a chance to learn something about the process
of searching and about the process of interaction with an intermediary.

For experimental purposes half of the students will be randomly assigned
to first doing the search through IIDA with the other half starting off with
having the search done by the intermediary. This will allow comparison
of the searches done by students with the searches done by a trained
and experienced intermediary. The searches done by the intermediary provide
a "bench mark" to be used in determining whether the students are able
to complete a search with a reasonable degree of competence. It is also
assumed that the areas of search performance where the students fall short
of the standard set by the intermediary may provide us with some guidelines
for improving the design of IIDA.

3.2 Business Writing 

The same department responsible for teaching the Technical Writing courses
mentioned above has also recently begun to offer at least two sections per
term of a course in Business Writing. The course is designed to be for students
from the College of Business what Technical Writing is for students
from the College of Engineering. We have proposed the idea as outlined above
to the faculty who have responsibility for the Business Writing course and
have been met with considerable enthusiasm.

While a study done with students in Business Writing could be conducted
in much the same manner as the one with students in Technical Writing we may
want to conduct both simultaneously, treating them as a single study. While
we would not have the option of randomly assigning students to curricula, this
would allow us to look at the issue of whether students from different
disciplines react differently to IIDA.

3.3 Cognitive Style 

One of the conceptions that has guided the development of the whole
IIDA system is that it is being designed for a scientifically oriented
user group. It has been assumed that scientist and non-scientist types
display different cognitive styles and that the scientist type of cognitive
style will be more compatible with IIDA than the non-scientist
type. Assuming that time and resources are sufficient, one of the things
we should like to do is to administer a test of cognitive style to a
relatively large number of students.

Study participants would then be recruited from this larger pool.
Two groups would be formed from the extreme scores and one from scorers
in the middle range. Comparisons of the performance of the various groups
would provide information about differences in user reactions to IIDA as
a function of cognitive style. Should major differences be found the
information thus gained could be utilized in further design modification of
IIDA in order to make the system more amenable to, or possibly more
adaptable to, different types of users. One major unsolved issue for the
conduct of this study is the selection of an appropriate test of cognitive
style.
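
The grouping step described above might look like the following sketch. Since no test of cognitive style has yet been selected, the scores, the student labels, and the group size are purely hypothetical, and the rule used to pick the "middle range" (the scorers centered on the median) is an assumption.

```python
def form_groups(scores, group_size):
    """Pick the lowest scorers, the highest scorers, and scorers around the median.

    `scores` maps a student identifier to a cognitive-style test score.
    Requires at least 3 * group_size students so the groups cannot overlap.
    """
    if len(scores) < 3 * group_size:
        raise ValueError("pool too small for three non-overlapping groups")
    ranked = sorted(scores, key=scores.get)          # lowest score first
    mid_start = (len(ranked) - group_size) // 2      # center the middle group
    return {
        "low": ranked[:group_size],
        "middle": ranked[mid_start:mid_start + group_size],
        "high": ranked[-group_size:],
    }


# Hypothetical scores for nine students.
scores = {"s1": 12, "s2": 45, "s3": 30, "s4": 8, "s5": 50,
          "s6": 28, "s7": 33, "s8": 19, "s9": 41}
groups = form_groups(scores, group_size=2)
print(groups)
```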

4. Measures for Evaluation of IIDA

The measures which should be useful in evaluation of the impact of the
system are to be collected both through internal automatic record keeping
functions of the system and through external means such as self-administered
questionnaires, interviews, etc.

4.1 Dependent Variables

In formulating these measures it is important to keep the dependent variables
in mind. In the case of IIDA as an assistant one is concentrating on
the following dependent variables:

a. Quality of the search - keeping in mind that we are interested
   not in the best possible search but in a sufficient search which
   satisfies the needs of the user.

   1. Product quality - recall, precision, and relevance

   2. Process quality
      Note: much of this will be derived from the automatic record
      keeping of the machine

   3. Error rates - internal, external reliability

b. Efficiency

   1. From a cost standpoint

   2. Number of steps to get there

c. Reuse or diffusion - does the "user" intend to employ IIDA again
   and does he/she intend to encourage others to do so.
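
Of the product quality measures listed, recall and precision have standard definitions that can be sketched briefly: recall is the fraction of relevant documents that were retrieved, and precision is the fraction of retrieved documents that are relevant. The document identifiers below are hypothetical, and deciding which documents count as relevant is assumed to have been done separately by a judge or by the requester.

```python
def recall_and_precision(retrieved, relevant):
    """Compute recall and precision for one search.

    retrieved: set of document identifiers the search returned
    relevant:  set of document identifiers judged relevant to the request
    """
    hits = len(retrieved & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical search: 8 documents retrieved, 10 judged relevant, 6 in common.
retrieved = {"d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"}
relevant = {"d1", "d2", "d3", "d4", "d5", "d6", "d9", "d10", "d11", "d12"}
recall, precision = recall_and_precision(retrieved, relevant)
print(f"recall={recall:.2f} precision={precision:.2f}")
```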

In the case of IIDA as a teaching instrument one is interested in these
same variables as well as one which measures the rate of learning - given
that the individual did not know how to search, how well is he/she currently
doing. In other words, one would need some measures before IIDA was employed
in the teaching mode.

In general, it will be important to assemble several data points on each
of the dependent variables. This can be accomplished through the use of
multiple searches. One is certainly interested in the rate of improvement
over time - to what extent does quality improve or even the inclination to
"diffuse" the innovation.

4.2 Independent Variables 

Given the dependent variables described above, we will be looking at four
classes of independent variables:

1. User satisfaction

2. Attitudes toward future user behavior

3. Problem solving or cognitive style

4. Demographics

4.2.1 User Satisfaction

In the area of user satisfaction, we will be interested in the
following type of question(s):

• The IIDA search(es) that I just completed was (were):

  enjoyable                                        not enjoyable
  very satisfactory                                very unsatisfactory
  helpful in working a class assignment            not very helpful
    or problem
  instrumental in working on an assignment/        not instrumental
    problem
  frustrating to use                               not frustrating
  stimulating to use                               not stimulating
  characterized by instructions that were easy     not very easy
    to follow

Note: in each of these cases of rating scale judgments one is tapping the
attitudes or perceptions of the user. One has the choice of employing
this at several points in time and measuring change or one can simply
use it as a summative evaluation measure.
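
If the scales are given at more than one point in time, the change could be summarized as in the sketch below. The report does not fix the number of scale points or a scoring convention, so the 7-point coding, the item labels, and the ratings are all assumptions for illustration.

```python
def mean_change(before, after):
    """Average per-item change in rating between two administrations.

    before/after map an item label to one respondent's numeric rating.
    Only items present at both points in time are compared.
    """
    shared = [item for item in before if item in after]
    if not shared:
        raise ValueError("no items in common")
    return sum(after[item] - before[item] for item in shared) / len(shared)


# Hypothetical 7-point ratings (7 = left-hand pole, e.g. "enjoyable").
before = {"enjoyable": 3, "satisfactory": 4, "helpful": 4}
after = {"enjoyable": 5, "satisfactory": 5, "helpful": 6}
print(mean_change(before, after))
```

Items such as "frustrating to use", where the left-hand pole is the unfavorable one, would need to be reverse-coded before being averaged with the others.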

4.2.2 Attitudes Toward Future User Behavior

In the area of attitudes toward future user behavior, the following
types of measures seem to make sense (some of these also relate to problem
solving style):

Agree strongly   Agree   Disagree   Disagree strongly   Don't know

• I do not like using the computer for classroom assignments

• My research is not enhanced through the use of a computer

• I would prefer to go to a librarian for bibliographic
  materials

• Intermediaries are preferable to a computer system

• Intermediaries are more comprehensive than I am able to be
  with the computer

• I would recommend that other people should learn bibliographic
  searching through IIDA

• IIDA is really limited to those with a background in the
  natural sciences

• IIDA is really limited to those with a background in computers

One would also want to ask the question before a search began, what
are your expectations in using IIDA? Then, after the search was completed,
one can ask, were these expectations met? Comparing before and after responses
can be quite helpful in an evaluation of this kind. In this case,
one might also consider a closed-ended question in terms of the expectations
of an IIDA search:

• assessing whether bibliographic searching is useful to solve a
  particular problem

• to learn to work with computers more readily

• to learn how to use this particular system

4.2.3 Problem-Solving Style 

We view the area of problem-solving style as being one of the most
interesting of the independent variables. We would start out by giving each
respondent a description of the problem-solving process as we see it.
The problem-solving process has the following stages:

• recognizing a problem

• defining the problem

• breaking it out into sub-problems

• selecting one of the sub-problems for "solution"

• generating options

• selecting an option

• implementation

• evaluation

Given this description several questions seem important:

• at what point in the problem-solving process are bibliographic
  materials most useful

• at what points have you applied bibliographic searching

• at what points can you envision applying these resources

• where would you advise others to apply these resources

A different type of question attempting to measure the same dimension
would read:

• when you have identified a problem, how do you identify the
  information resources that you will require:

  1. relying on colleagues

  2. relying on friends

  3. relying on a librarian

  4. relying on computer-based bibliographic materials

4.2.4 Demographics

Finally, we need to measure the following demographic variables:

• age

• discipline/major

• degree

• courses in the sciences, social sciences, and humanities
  (how many and at what level)

• previous experience using bibliographic materials

• previous experience using computers

• employment background

• future plans




III. REFERENCES 

1. Individualized Instruction for Data Access. Quarterly Report Number 2,
   NSF Grant No. DSI 77-265525, Drexel University School of Library and
   Information Science, Philadelphia, September, 1978.







